DISCRETE MALLIAVIN-STEIN METHOD: BERRY-ESSEEN BOUNDS
FOR RANDOM GRAPHS AND PERCOLATION 


KAI KROKOWSKI, ANSELM REICHENBACHS, AND CHRISTOPH THÄLE

Abstract. A new Berry-Esseen bound for non-linear functionals of non-symmetric and 
non-homogeneous infinite Rademacher sequences is established. It is based on a discrete 
version of the Malliavin-Stein method and an analysis of the discrete Ornstein-Uhlenbeck 
semigroup. The result is applied to sub-graph counts and to the number of vertices having 
a prescribed degree in the Erdős-Rényi random graph. A further application deals with a
percolation problem on trees. 


1. Introduction 


The Malliavin-Stein method has become a versatile device for proving quantitative limit
theorems. It combines the Malliavin calculus of variations with Stein’s method. The results
obtained this way typically fall into two categories. The first category consists of limit theorems
for non-linear functionals defined on the Wiener space with notable applications to Gaussian
random processes, especially the fractional Brownian motion [19, 20, 22], random matrices
[21] and random polynomials [1]. The other category comprises limit theorems for functionals of
Poisson random measures and their applications to stochastic geometry [6, 15, 17, 18, 28, 36],
U-statistics [4, 6, 16, 33], non-linear statistics of spherical Poisson fields [5] and the theory
of Lévy processes [6, 17, 26].


On the other hand, the Malliavin-Stein method has left only few traces in that part of
probability theory in which discrete random structures are investigated. One exception is the paper
[24], where Stein’s method for normal approximation has been combined with tools from
discrete stochastic analysis for symmetric Rademacher sequences to deduce quantitative central
limit theorems with respect to probability distances based on smooth test functions. Here,
by a symmetric Rademacher sequence we understand an infinite sequence of independent and
identically distributed random variables taking the values $\pm 1$ with probability $1/2$ each. This
approach has been extended in [14] to deduce Berry-Esseen bounds, that is, estimates for the
Kolmogorov distance in related central limit theorems. The applications considered in [14, 24]
concern the number of two-runs, a quantitative version of a combinatorial central limit
theorem as well as traces of powers of random Bernoulli matrices. While the previously mentioned
papers were concerned with the symmetric case, we work with general non-linear functionals
of non-symmetric and even non-homogeneous Rademacher sequences in order to bring a rich
class of examples that were not accessible before within the reach of the Malliavin-Stein
method. Moreover, we emphasize that some of the examples we present below are not within
the reach of any of the traditional approaches using Stein’s method.

One of the main tools of the Malliavin-Stein method on the Wiener or the Poisson space
is the so-called multiplication formula for multiple stochastic integrals, cf. [27] for a
general overview. The main difficulty in the discrete set-up is that no such multiplication
formula for discrete multiple stochastic integrals based on non-symmetric or non-homogeneous
Rademacher sequences is available. Consequently, a new type of abstract Berry-Esseen bound
needs to be developed, which gets along without this technical device. Such a result,
namely Theorem 4.1 below, is one of our main contributions. It can be interpreted as a kind
of ‘second-order Poincaré inequality’ and is the discrete analogue of corresponding results on
the Wiener or the Poisson space, cf. [17, 23]. It relies on a generalization of the Malliavin-Stein
bound established in [14] and on an analysis of the discrete Ornstein-Uhlenbeck semigroup.
To make this approach work, we also have to develop further some facets of the discrete
Malliavin calculus of variations. In particular, we present a generalization of the
integration-by-parts formula, which is one of our crucial devices. The Berry-Esseen bound we obtain this
way is particularly well suited for the study of discrete random structures. This is due to the
fact that the chaotic decomposition of the functional at hand does not have to be specified.
Instead, the impact of local perturbations on the functional, measured by means of a certain
difference operator (the discrete Malliavin derivative), has to be evaluated. A sufficient
condition for asymptotic normality is that moments of first- and second-order discrete Malliavin
derivatives of the functional are sufficiently small.

To highlight the versatility of our general limit theorem we now present a couple of concrete
applications. The first one deals with the triangle counting statistic associated with the
Erdős-Rényi random graph. Introduced in [7], the model has since then become one of the most
popular models in discrete probability, cf. [10] for an exhaustive list of references. Informally,
the random graph $G(n,p)$ is a graph on $n\in\mathbb{N}$ vertices in which each edge between two
vertices is included with probability $p\in[0,1]$, independently of the other edges (for a detailed
construction see Section 5 below and see Figure 1 for simulations). In what follows we allow
$p$ also to depend on $n$, but for practical reasons we suppress this in our notation. The
random variable in the focus of our attention is the number $T = T(n,p)$ of triangles in
$G(n,p)$, i.e., the number of sub-graphs of $G(n,p)$ that are isomorphic to the complete graph
on 3 vertices. A comprehensive central limit theorem for the normalized random variable
$F := (T-\mathbb{E}[T])/\sqrt{\operatorname{Var}[T]}$ has been derived in [34] by the method of moments. In particular, it
provides a necessary and sufficient condition on $n$ and $p$, which ensures asymptotic Gaussianity
for $F$. Namely, as $n\to\infty$, one has that

\[
F \xrightarrow{d} N \quad\text{if and only if}\quad np\to\infty \ \text{ and } \ n^2(1-p)\to\infty,
\]

where $N\sim\mathcal{N}(0,1)$ is a standard Gaussian random variable and $\xrightarrow{d}$ indicates convergence
in distribution. Using Stein’s method for normal approximation, a rate of convergence in
this central limit theorem measured by some sort of bounded Wasserstein distance has been
established in [2]. If $p\in(0,1)$ is fixed,

\[
d_1(F,N) := \sup_{h\in\mathcal{H}} \frac{\big|\mathbb{E}[h(F)]-\mathbb{E}[h(N)]\big|}{\|h\|_\infty+\|h'\|_\infty} = O(n^{-1}),
\]

where $\mathcal{H}$ is the class of bounded functions $h:\mathbb{R}\to\mathbb{R}$ with bounded first derivative and
where $\|\cdot\|_\infty$ denotes the supremum norm. For the case that $p=\theta n^{-\alpha}$ with $\alpha\in(0,1)$ and
$\theta\in(0,n^{\alpha})$ such that $\theta\asymp 1$ the result in [2] delivers the bound

\[
d_1(F,N) =
\begin{cases}
O(n^{-1+\alpha/2}) & \text{if } 0<\alpha\le\frac12,\\
O(n^{-3(1-\alpha)/2}) & \text{if } \frac12<\alpha<1.
\end{cases}
\]


We use the following standard notation for comparing the order of magnitude of two real
sequences: We write $a_n\asymp b_n$ for two real sequences $(a_n)_{n\in\mathbb{N}}$ and $(b_n)_{n\in\mathbb{N}}$ whenever

\[
c \le \liminf_{n\to\infty}\Big|\frac{a_n}{b_n}\Big| \le \limsup_{n\to\infty}\Big|\frac{a_n}{b_n}\Big| \le C
\]

for two constants $0<c<C<\infty$. We also write $a_n=O(b_n)$ for two non-negative sequences
$(a_n)_{n\in\mathbb{N}}$ and $(b_n)_{n\in\mathbb{N}}$ if there is a constant $c\in(0,\infty)$ such that $a_n\le c\,b_n$ for sufficiently large
$n$. Applying a standard smoothing argument, one can show that the more prominent and
more natural Kolmogorov distance

\[
d_K(F,N) := \sup_{x\in\mathbb{R}} \big|\mathbb{P}(F\le x)-\mathbb{P}(N\le x)\big|
\]

between $F$ and $N$ is bounded by a constant multiple of the square-root of $d_1(F,N)$, cf.
[29, Proposition 2.4]. However, this typically leads to a suboptimal rate of convergence for
the Kolmogorov distance $d_K(F,N)$. For example, in the special case of a fixed $p\in(0,1)$
one expects that also $d_K(F,N)$ is of order $n^{-1}$. Our main contribution in this context is
the following Berry-Esseen bound, which in particular confirms that this is in fact true. We
emphasize that we are not aware of any other technique which could be used to provide
bounds on the Kolmogorov distance of this quality if $p$ is of the form $\theta n^{-\alpha}$ with $\alpha\in(0,1)$
and $\theta\in(0,n^{\alpha})$ such that $\theta\asymp 1$. In what follows, we treat both set-ups simultaneously and
contribute thereby to a long standing problem in this area.


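Theorem 1.1. Denote by $N\sim\mathcal{N}(0,1)$ a standard Gaussian random variable. Let $p=\theta n^{-\alpha}$
with $\alpha\in[0,1)$ and $\theta=\theta_n\in(0,n^{\alpha})$ such that $\theta\asymp 1$. Then

\[
d_K(F,N) =
\begin{cases}
O(n^{-1+\alpha}) & \text{if } 0\le\alpha\le\frac12,\\
O(n^{-3/4+\alpha/2}) & \text{if } \frac12<\alpha\le\frac23,\\
O(n^{-5(1-\alpha)/4}) & \text{if } \frac23<\alpha<1.
\end{cases}
\]

In particular, if $p$ is constant, i.e., if $\alpha=0$,

\[
d_K(F,N) = O(n^{-1}).
\]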
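The objects appearing in Theorem 1.1 can be explored numerically. The following Python sketch is an illustration only and not part of the paper's argument: it samples $G(n,p)$, counts triangles by brute force, normalizes with the exact mean and variance of $T$, and evaluates the empirical Kolmogorov distance to the standard Gaussian law. All function names (`sample_gnp`, `triangle_count`, `triangle_mean_var`, `kolmogorov_to_normal`, `simulate_F`) are hypothetical helpers introduced here. The variance formula uses the standard fact that only pairs of triangles sharing an edge are correlated.

```python
import math
import random
from itertools import combinations


def sample_gnp(n, p, rng):
    """Adjacency sets of one realization of G(n, p)."""
    adj = [set() for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if rng.random() < p:  # each edge is present independently with probability p
            adj[i].add(j)
            adj[j].add(i)
    return adj


def triangle_count(adj):
    """Number T of triangles, by checking all vertex triples (fine for small n)."""
    return sum(
        1
        for i, j, k in combinations(range(len(adj)), 3)
        if j in adj[i] and k in adj[i] and k in adj[j]
    )


def triangle_mean_var(n, p):
    """Exact E[T] and Var[T]; only triangle pairs sharing an edge are correlated."""
    mean = math.comb(n, 3) * p**3
    var = math.comb(n, 3) * (p**3 - p**6) + 12 * math.comb(n, 4) * (p**5 - p**6)
    return mean, var


def kolmogorov_to_normal(sample):
    """sup_x |F_emp(x) - Phi(x)| for the empirical distribution function of sample."""
    xs = sorted(sample)
    m = len(xs)
    phi = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return max(
        max(abs((i + 1) / m - phi(x)), abs(i / m - phi(x))) for i, x in enumerate(xs)
    )


def simulate_F(n, p, reps, seed=0):
    """Monte Carlo sample of F = (T - E[T]) / sqrt(Var[T])."""
    rng = random.Random(seed)
    mean, var = triangle_mean_var(n, p)
    return [
        (triangle_count(sample_gnp(n, p, rng)) - mean) / math.sqrt(var)
        for _ in range(reps)
    ]


sample = simulate_F(n=20, p=0.5, reps=200)
print(kolmogorov_to_normal(sample))
```

The printed value contains a Monte Carlo error of order `reps ** -0.5`, so a simulation of this size cannot resolve a rate as fast as $n^{-1}$; it only illustrates how $F$ and $d_K(F,N)$ are defined.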


To underline that not only triangle counts are within the reach of our methods, we now
consider the problem of counting copies of general sub-graphs $\Gamma$ in the Erdős-Rényi random
graph $G(n,p)$. Formally, we denote by $S=S(n,p)$ the number of copies of $\Gamma$ in $G(n,p)$
and by $F := (S-\mathbb{E}[S])/\sqrt{\operatorname{Var}[S]}$ the normalized sub-graph counting statistic. We assume
that $\Gamma$ has at least one edge and in contrast to Theorem 1.1 we also assume that the success
probability $p\in(0,1)$ is fixed and does not depend on $n$. In this situation, Theorem 2 in [2]
says that

\[
d_1(F,N) = O(n^{-1})
\]

and our abstract Berry-Esseen bound can be used to conclude that the $d_1$-distance can be
replaced by the Kolmogorov distance.

Figure 1. Realizations of Erdős-Rényi random graphs with $n=50$ vertices
and $p=0.04$ (left) and $p=n^{-1/2}\approx 0.14$ (right). The graphics were produced
by means of the freely available R-package igraph.


Theorem 1.2. Denote by $N\sim\mathcal{N}(0,1)$ a standard Gaussian random variable and fix
$p\in(0,1)$. Then

\[
d_K(F,N) = O(n^{-1})
\]

for all graphs $\Gamma$ having at least one edge.


It should be pointed out that Theorem 1.2 can alternatively be obtained from the sharp
cumulant estimate in [8, Proposition 10.3] together with [35, Corollary 2.1] or from the
Berry-Esseen bound for decomposable random variables in [32, Theorem 5.1].

Besides the number of triangles or general sub-graphs, there are several other random variables
associated with the Erdős-Rényi random graph that have found considerable attention in the
literature. One statistic that has been the object of much study is the number of vertices having
a prescribed degree. For example, in [12] a central limit theorem for the number of isolated
vertices was given, which for general degree is a result in [11]. A rate of convergence for the
$d_1$-distance as introduced above has been obtained in [2]. A technically highly sophisticated
version of Stein’s method was developed in [9] to deduce a corresponding Berry-Esseen bound
in case that the success probability is $p=\theta/n$. Using our general Berry-Esseen bound, we
are able to present a quick and streamlined proof of an extended version of this quantitative
central limit theorem. We denote for $d\in\{0,1,2,\ldots\}$ by $V_{n,d}$ the number of vertices of degree
$d$ in $G(n,p)$ in case that the success probability satisfies $p=\theta n^{-\alpha}$ for suitable $\alpha\in\mathbb{R}$ and
$\theta\in(0,n^{\alpha})$ such that $\theta\asymp 1$. We finally define the normalized random variable
$G_{n,d} := (V_{n,d}-\mathbb{E}[V_{n,d}])/\sqrt{\operatorname{Var}[V_{n,d}]}$, $n\in\mathbb{N}$.


Theorem 1.3. Denote by $N\sim\mathcal{N}(0,1)$ a standard Gaussian random variable and fix
$d\in\{0,1,2,\ldots\}$. Let $p=\theta n^{-\alpha}$ with $\alpha\in[1,2)$ and $\theta\in(0,n^{\alpha})$ such that $\theta\asymp 1$. Then

\[
d_K(G_{n,d},N) =
\begin{cases}
O(n^{-1+\alpha/2}) & \text{if } d=0,\ \alpha\in[1,2),\\
O(n^{1/2-3d/2-\alpha+3\alpha d/2}) & \text{if } d\in\mathbb{N},\ \alpha\in\big[1,\tfrac{3d-1}{3d-2}\big).
\end{cases}
\]

In particular, if $\alpha=1$,

\[
d_K(G_{n,d},N) = O(n^{-1/2})
\]

for all $d\in\{0,1,2,\ldots\}$.


Our final application deals with the number of connected components arising from bond
percolation on a tree. We recall that a rooted tree $\mathbb{T}$ is an undirected graph with one distinguished
vertex, the root of the tree, in which any two vertices are connected by a unique self-avoiding
path. We denote for $n\in\mathbb{N}$ by $\mathbb{T}_n$ the sub-tree of $\mathbb{T}$, which consists of all vertices of $\mathbb{T}$ that
have graph-distance at most $n$ from the root. By $|\mathbb{T}_n|$ we denote the number of edges of $\mathbb{T}_n$.
In what follows, we assume that each vertex of $\mathbb{T}$ has degree bounded by $D+1$ with $D\in\mathbb{N}$
and that $\mathbb{T}$ has infinitely many vertices. If the degree of the root is $D$ and if the degree of
each other vertex of $\mathbb{T}$ is $D+1$ for some fixed $D\in\mathbb{N}$, we say that $\mathbb{T}$ is a $D$-regular tree. Fix
$p\in(0,1)$ and assign to each edge $e$ of $\mathbb{T}$, independently of the other edges, a Rademacher
random variable $X_e$ such that $\mathbb{P}(X_e=+1)=p$ and $\mathbb{P}(X_e=-1)=1-p$. We now remove
from $\mathbb{T}$ all edges $e$ for which $X_e=-1$ and denote by $\mathbb{T}(p)$ the resulting random graph, see
Figure 2 for a simulation. Its restriction to $\mathbb{T}_n$ is denoted by $\mathbb{T}_n(p)$ and we let $C_n(p)$ be the
number of connected components of $\mathbb{T}_n(p)$. Here, by a connected component we understand
a maximal connected sub-graph of $\mathbb{T}_n(p)$ consisting of at least one edge; isolated vertices are
not counted. Our next result is a Berry-Esseen bound for the normalized random variable
$H_n(p) := (C_n(p)-\mathbb{E}[C_n(p)])/\sqrt{\operatorname{Var}[C_n(p)]}$. This adds to the qualitative central limit theorem
in [13].


Theorem 1.4. Fix $p\in(0,1)$ and denote by $N\sim\mathcal{N}(0,1)$ a standard Gaussian random
variable. Then

\[
d_K(H_n(p),N) = O(|\mathbb{T}_n|^{-1/2}).
\]

In particular, in case of a $D$-regular tree one has that

\[
d_K(H_n(p),N) =
\begin{cases}
O(n^{-1/2}) & \text{if } D=1,\\
O(D^{-n/2}) & \text{if } D\ge 2.
\end{cases}
\]
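To make the percolation model concrete, here is a small Python sketch under the conventions stated above (root of degree $D$, every other vertex with $D$ children). It is an illustration only; the helper names `tree_edges`, `component_count` and `simulate_components` are hypothetical. Since the percolated graph is a forest, the number of components with at least one edge equals the number of vertices touched by a retained edge minus the number of retained edges.

```python
import random


def tree_edges(D, n):
    """Edges (parent, child) of the sub-tree T_n of a D-regular tree; the root is 0."""
    edges, level, next_id = [], [0], 1
    for _ in range(n):
        new_level = []
        for v in level:
            for _ in range(D):  # every vertex has D children
                edges.append((v, next_id))
                new_level.append(next_id)
                next_id += 1
        level = new_level
    return edges


def component_count(kept_edges):
    """Components with at least one edge in a forest: touched vertices minus edges."""
    touched = {v for e in kept_edges for v in e}
    return len(touched) - len(kept_edges)


def simulate_components(D, n, p, rng):
    """One realization of C_n(p): each edge is retained independently with probability p."""
    kept = [e for e in tree_edges(D, n) if rng.random() < p]
    return component_count(kept)


rng = random.Random(1)
print(len(tree_edges(2, 5)), simulate_components(2, 5, 0.5, rng))
```

For $D\ge 2$ the number of edges $|\mathbb{T}_n| = D + D^2 + \cdots + D^n$ grows like $D^n$, which is where the rate $O(D^{-n/2})$ in Theorem 1.4 comes from; for $D=1$ the tree is a path with $n$ edges.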


The rest of this paper is structured as follows. In Section 2 we collect some background
material related to the discrete Malliavin calculus. An analysis of the discrete
Ornstein-Uhlenbeck semigroup is the content of Section 3. This is used in Section 4 to derive our
abstract Berry-Esseen bound, which in turn is applied in Section 5 to the Erdős-Rényi random
graph and in Section 6 to the percolation problem on trees. These sections also contain the
proofs of Theorem 1.1, Theorem 1.2, Theorem 1.3 and Theorem 1.4 presented above.


Note added in proof. After submission of the paper it came to our attention that a
multiplication formula for discrete multiple stochastic integrals based on non-symmetric and
non-homogeneous Rademacher sequences has been developed in a manuscript by Privault
and Torrisi that was not available to us, but has now appeared as [31]. See also [13].


2. Preliminaries 

2.1. Set-up. For each $k\in\mathbb{N}$ let $0<p_k<1$ and put $q_k := 1-p_k$. We abbreviate the
sequences $(p_k)_{k\in\mathbb{N}}$ and $(q_k)_{k\in\mathbb{N}}$ by $p$ and $q$, respectively. By $X := (X_k)_{k\in\mathbb{N}}$ we denote a
sequence of independent random variables such that

\[
\mathbb{P}(X_k=+1) = p_k \quad\text{and}\quad \mathbb{P}(X_k=-1) = q_k, \qquad k\in\mathbb{N}.
\]

Figure 2. Sub-tree $\mathbb{T}_n$ of a 2-regular tree $\mathbb{T}$ (left). Realization of $\mathbb{T}_n(p)$ with $p=1/2$
(right). The colour red means that the edge is included, while green indicates
that the edge has been removed. The graphics were produced by means of the
freely available R-package igraph.

This is what we call a (non-symmetric and non-homogeneous) sequence of independent
Rademacher random variables. We construct them in the canonical way by taking $(\Omega,\mathcal{F},\mathbb{P})$ as
probability space, where $\Omega := \{-1,+1\}^{\mathbb{N}}$, $\mathcal{F} := \mathcal{P}(\{-1,+1\})^{\otimes\mathbb{N}}$ and
$\mathbb{P} := \bigotimes_{k=1}^{\infty}(p_k\delta_{+1}+q_k\delta_{-1})$, with $\mathcal{P}(M)$ being the power set of a set $M$ and $\delta_{\pm 1}$ being the unit-mass Dirac measure
concentrated at $\pm 1$. We then put $X_k(\omega) := \omega_k$ for each $k\in\mathbb{N}$ and $\omega := (\omega_k)_{k\in\mathbb{N}}\in\Omega$. Note
that $X_k$ has mean $p_k-q_k$ and variance $4p_kq_k$.
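The stated mean and variance can be checked by a direct computation. The short Python sketch below (an illustration with hypothetical helper names `rademacher_moments` and `normalized_values`) does this in exact rational arithmetic for one value of $p_k$, and also evaluates the two values taken by the standardized variable $(X_k - p_k + q_k)/(2\sqrt{p_kq_k})$, which is used throughout the discrete Malliavin calculus.

```python
from fractions import Fraction


def rademacher_moments(p):
    """Exact mean and variance of X with P(X = +1) = p and P(X = -1) = 1 - p."""
    q = 1 - p
    mean = p * 1 + q * (-1)
    second_moment = p * 1 + q * 1  # X^2 = 1 almost surely
    return mean, second_moment - mean**2


def normalized_values(p):
    """Values of Y = (X - p + q) / (2 sqrt(p q)) at X = +1 and X = -1 (floats)."""
    q = 1 - p
    scale = 2 * (p * q) ** 0.5
    return (1 - p + q) / scale, (-1 - p + q) / scale


p = Fraction(3, 10)
mean, var = rademacher_moments(p)
print(mean, var)  # equal to p - q and 4 p q

y_plus, y_minus = normalized_values(0.3)
print(0.3 * y_plus + 0.7 * y_minus)        # mean of Y, zero up to rounding
print(0.3 * y_plus**2 + 0.7 * y_minus**2)  # variance of Y, one up to rounding
```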

2.2. Discrete multiple stochastic integrals. Denote by $\kappa$ the counting measure on $\mathbb{N}$ and
put $\ell^2(\mathbb{N})^{\otimes n} := L^2(\mathbb{N}^n,\mathcal{P}(\mathbb{N})^{\otimes n},\kappa^{\otimes n})$ for $n\in\mathbb{N}$. In the following, we refer to the elements of
$\ell^2(\mathbb{N})^{\otimes n}$ as kernels. By $\ell^2(\mathbb{N})^{\circ n}$ we denote the class of symmetric kernels and $\ell_0^2(\mathbb{N})^{\circ n}$ stands for
the sub-class of symmetric kernels vanishing on diagonals, i.e., vanishing on the complement
of the set $\Delta_n := \{(i_1,\ldots,i_n)\in\mathbb{N}^n : i_k\neq i_\ell \text{ for } k\neq\ell\}$. We further put $\ell^2(\mathbb{N})^{\otimes 0} := \mathbb{R}$.

For $n\in\mathbb{N}$ and a kernel $f\in\ell_0^2(\mathbb{N})^{\circ n}$ we define the discrete multiple stochastic integral of order
$n$ of $f$ as

\[
J_n(f) := n! \sum_{1\le i_1<\ldots<i_n<\infty} f(i_1,\ldots,i_n)\, Y_{i_1}\cdots Y_{i_n},
\]

where $(Y_k)_{k\in\mathbb{N}}$ with $Y_k := (X_k-p_k+q_k)/(2\sqrt{p_kq_k})$ stands for the normalized sequence of
independent Rademacher random variables as introduced above. We also put $J_0(c) = c$ for
$c\in\mathbb{R}$. The space spanned by the random variables of the form $J_n(f)$ with $f\in\ell_0^2(\mathbb{N})^{\circ n}$ is
called the Rademacher chaos of order $n$.

Discrete multiple stochastic integrals of different orders are mutually orthogonal and satisfy
the isometry relation

\[
\mathbb{E}[J_n(f)J_m(g)] = \mathbf{1}_{\{n=m\}}\, n!\, \langle f,g\rangle_{\ell^2(\mathbb{N})^{\otimes n}} \tag{2.1}
\]

for all $n,m\in\mathbb{N}$ and kernels $f\in\ell_0^2(\mathbb{N})^{\circ n}$, $g\in\ell_0^2(\mathbb{N})^{\circ m}$. Moreover, it is a classical fact
that every $F\in L^2(\Omega)$ (i.e., every square-integrable Rademacher functional) admits a chaotic
decomposition

\[
F = \mathbb{E}[F] + \sum_{n=1}^{\infty} J_n(f_n) \tag{2.2}
\]

for uniquely determined kernels $f_n\in\ell_0^2(\mathbb{N})^{\circ n}$, where the series converges in $L^2(\Omega)$, cf.
[30, Proposition 6.7]. Together with the isometry relation for discrete multiple stochastic integrals
this decomposition implies that the variance of $F$ is given by

\[
\operatorname{Var}[F] = \sum_{n=1}^{\infty} n!\, \|f_n\|^2_{\ell^2(\mathbb{N})^{\otimes n}}.
\]

2.3. Malliavin calculus. In this section we introduce some basic notions from discrete
Malliavin calculus and refer to [30] for further details and background material. Let $F\in L^2(\Omega)$.
The discrete gradient of $F$ in direction $k\in\mathbb{N}$ is given by

\[
D_kF := \sqrt{p_kq_k}\,(F_k^+ - F_k^-), \tag{2.3}
\]

where $F_k^{\pm}(\omega) := F(\omega_1,\ldots,\omega_{k-1},\pm 1,\omega_{k+1},\ldots)$. Note that the normalization in (2.3) is chosen
such that $D_kY_k = 1$.

The discrete gradient satisfies the following product formula. Namely, if $F,G\in L^2(\Omega)$ and
$k\in\mathbb{N}$, then

\[
D_k(FG) = (D_kF)G + F(D_kG) - \frac{X_k}{\sqrt{p_kq_k}}\,(D_kF)(D_kG), \tag{2.4}
\]

see [30, Proposition 7.8]. We remark that in contrast to classical Malliavin calculus (see [25]),
the product formula in the discrete set-up carries the additional term

\[
-\frac{X_k}{\sqrt{p_kq_k}}\,(D_kF)(D_kG),
\]

which is not present in the continuous framework. A similar effect also happens on the Poisson
space, cf. [26] and the references cited therein.
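The definition (2.3) and the product formula (2.4) can be verified by brute force for functionals of finitely many coordinates. The following Python sketch is an illustration under the assumption that functionals are given as functions of a tuple `omega` in $\{-1,+1\}^d$ and that coordinates are indexed from 0 (the text indexes from 1); the names `gradient` and `product_rule_gap` are hypothetical.

```python
import itertools
import math


def gradient(F, k, omega, p):
    """D_k F(omega) = sqrt(p_k q_k) * (F with +1 at position k minus F with -1 at position k)."""
    plus = omega[:k] + (1,) + omega[k + 1:]
    minus = omega[:k] + (-1,) + omega[k + 1:]
    return math.sqrt(p[k] * (1 - p[k])) * (F(plus) - F(minus))


def product_rule_gap(F, G, k, omega, p):
    """Absolute difference between the two sides of the product formula (2.4) at omega."""
    FG = lambda w: F(w) * G(w)
    lhs = gradient(FG, k, omega, p)
    dF, dG = gradient(F, k, omega, p), gradient(G, k, omega, p)
    correction = omega[k] / math.sqrt(p[k] * (1 - p[k])) * dF * dG
    rhs = dF * G(omega) + F(omega) * dG - correction
    return abs(lhs - rhs)


p = (0.2, 0.5, 0.7)                      # success probabilities p_k (arbitrary choice)
F = lambda w: w[0] * w[1] + 2 * w[2]     # two arbitrary test functionals
G = lambda w: max(w) + w[1] * w[2]

max_gap = max(
    product_rule_gap(F, G, k, w, p)
    for k in range(3)
    for w in itertools.product((1, -1), repeat=3)
)
print(max_gap)  # zero up to floating-point rounding
```

The correction term is what distinguishes (2.4) from the classical Leibniz rule; dropping it in `product_rule_gap` makes the gap non-zero for generic `F` and `G`.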

The iterated discrete gradient $D^nF := (D^n_{k_1,\ldots,k_n}F)_{k_1,\ldots,k_n\in\mathbb{N}}$ of order $n\ge 2$ is defined by
$D^n_{k_1,\ldots,k_n}F := D_{k_n}(D^{n-1}_{k_1,\ldots,k_{n-1}}F)$ for $k_1,\ldots,k_n\in\mathbb{N}$, where we put $D^1_kF := D_kF$.

We now present a formula which allows one to compute the kernels $f_n$ in a chaotic decomposition
as in (2.2). In the framework of classical Malliavin calculus this is known as Stroock’s formula.
Since we have not found such a result for general Rademacher functionals in the literature,
we provide the detailed arguments (for the special symmetric case see Lemma 2.2 in [14] and
Section 2.4 in [24]).


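Proposition 2.1 (Stroock’s formula). Assume that $F\in L^2(\Omega)$ has chaotic decomposition
$F = \mathbb{E}[F] + \sum_{n=1}^{\infty} J_n(f_n)$. Then for every $n\in\mathbb{N}$ it holds that

\[
\mathbb{E}[D^n_{k_1,\ldots,k_n}F] = \mathbb{E}[F\,Y_{k_1}\cdots Y_{k_n}], \qquad (k_1,\ldots,k_n)\in\Delta_n, \tag{2.5}
\]

and

\[
\mathbb{E}[D^n_{k_1,\ldots,k_n}F] = n!\,f_n(k_1,\ldots,k_n), \qquad (k_1,\ldots,k_n)\in\mathbb{N}^n. \tag{2.6}
\]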

Proof. We start by proving (2.5) by induction. Choosing $G = Y_k$ with $D_kG = D_kY_k = 1$ in
(2.4) yields

\[
D_k(FY_k) = (D_kF)Y_k + F - \frac{X_k}{\sqrt{p_kq_k}}\,D_kF
\]

and hence

\[
D_k(FY_k)\,Y_k = (D_kF)\,Y_k^2 + FY_k - \frac{X_kY_k}{\sqrt{p_kq_k}}\,D_kF. \tag{2.7}
\]

It immediately follows from (2.3) that, for every $F\in L^2(\Omega)$, $D_kF$ is independent of $X_k$.
Therefore, by taking expectations on both sides of (2.7), using $\mathbb{E}[Y_k]=0$ and $\mathbb{E}[Y_k^2]=1$, and computing $\mathbb{E}[X_kY_k] = 2\sqrt{p_kq_k}$,
we get

\[
0 = \mathbb{E}[D_kF] + \mathbb{E}[FY_k] - 2\,\mathbb{E}[D_kF] = \mathbb{E}[FY_k] - \mathbb{E}[D_kF],
\]

which proves (2.5) for $n=1$. Now, assume that (2.5) holds for some fixed $n\in\mathbb{N}$ and consider

\[
\mathbb{E}[D^{n+1}_{k_1,\ldots,k_{n+1}}F] = \mathbb{E}[D_{k_{n+1}}(D^n_{k_1,\ldots,k_n}F)].
\]

From the case $n=1$ treated above it follows that

\[
\mathbb{E}[D^{n+1}_{k_1,\ldots,k_{n+1}}F] = \mathbb{E}[(D^n_{k_1,\ldots,k_n}F)\,Y_{k_{n+1}}],
\]

and since $Y_{k_{n+1}}$ behaves like a constant from the point of view of $D^n_{k_1,\ldots,k_n}$, our assumption
leads to

\[
\mathbb{E}[D^{n+1}_{k_1,\ldots,k_{n+1}}F] = \mathbb{E}[D^n_{k_1,\ldots,k_n}(FY_{k_{n+1}})] = \mathbb{E}[F\,Y_{k_1}\cdots Y_{k_{n+1}}],
\]

which concludes the proof of (2.5). Identity (2.6) then immediately follows from (2.2) for
$(k_1,\ldots,k_n)\in\Delta_n$. For $(k_1,\ldots,k_n)\notin\Delta_n$ both sides of (2.6) are equal to zero. □
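Identity (2.5) lends itself to an exact numerical check for functionals of finitely many Rademacher variables. The Python sketch below is an illustration only, with hypothetical helper names (`prob`, `expect`, `gradient`, `Y`, `stroock_sides`) and 0-based coordinate indices: it evaluates both sides of (2.5) by summing over all $\omega\in\{-1,+1\}^d$ with the product weights.

```python
import itertools
import math


def prob(omega, p):
    """Probability of omega under the product measure with P(omega_k = +1) = p[k]."""
    return math.prod(p[k] if w == 1 else 1 - p[k] for k, w in enumerate(omega))


def expect(H, p):
    """Exact expectation of H by enumeration of {-1, +1}^d."""
    return sum(prob(w, p) * H(w) for w in itertools.product((1, -1), repeat=len(p)))


def gradient(F, k, p):
    """Return the function omega -> D_k F(omega), cf. (2.3)."""
    a = math.sqrt(p[k] * (1 - p[k]))
    return lambda w: a * (F(w[:k] + (1,) + w[k + 1:]) - F(w[:k] + (-1,) + w[k + 1:]))


def Y(k, p):
    """Normalized coordinate Y_k = (X_k - p_k + q_k) / (2 sqrt(p_k q_k))."""
    q = 1 - p[k]
    return lambda w: (w[k] - p[k] + q) / (2 * math.sqrt(p[k] * q))


def stroock_sides(F, ks, p):
    """Return (E[D^n_{k_1..k_n} F], E[F Y_{k_1} ... Y_{k_n}]) for the index tuple ks."""
    DF = F
    for k in ks:  # iterate the gradient: D_{k_n}( ... D_{k_1} F)
        DF = gradient(DF, k, p)
    lhs = expect(DF, p)
    rhs = expect(lambda w: F(w) * math.prod(Y(k, p)(w) for k in ks), p)
    return lhs, rhs


p = (0.2, 0.5, 0.7)
F = lambda w: max(w) * w[0] + w[1] * w[2] - 3 * w[0] * w[1] * w[2]
for ks in [(0,), (0, 1), (2, 0, 1)]:
    print(ks, stroock_sides(F, ks, p))  # the two entries agree up to rounding
```

Dividing either side by $n!$ gives the kernel value $f_n(k_1,\ldots,k_n)$ from (2.6). For a repeated index the iterated gradient vanishes identically, in line with the last sentence of the proof.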

For the rest of this section, let $F\in L^2(\Omega)$ have chaotic decomposition $F = \mathbb{E}[F]+\sum_{n=1}^{\infty}J_n(f_n)$
with kernels $f_n\in\ell_0^2(\mathbb{N})^{\circ n}$ for $n\in\mathbb{N}$. Since $D_kF\in L^2(\Omega)$ for every $k\in\mathbb{N}$, the discrete gradient
also has a chaotic decomposition. Note that the kernels of this decomposition can be deduced
from the chaotic decomposition of $F$ using Stroock’s formula. More precisely, the $n$’th kernel
of the chaotic decomposition of $D_kF$ evaluated at $(k_1,\ldots,k_n)\in\mathbb{N}^n$ is given by

\[
\frac{1}{n!}\,\mathbb{E}[D^n_{k_1,\ldots,k_n}(D_kF)] = \frac{1}{n!}\,\mathbb{E}[D^{n+1}_{k,k_1,\ldots,k_n}F] = (n+1)\,f_{n+1}(k_1,\ldots,k_n,k).
\]

Thus, the discrete gradient can be written as

\[
D_kF = \sum_{n=1}^{\infty} n\,J_{n-1}(f_n(\,\cdot\,,k)), \tag{2.8}
\]

where $f_n(\,\cdot\,,k)\in\ell_0^2(\mathbb{N})^{\circ(n-1)}$ denotes the kernel $f_n$ with one of its components fixed, thus acting
as a function in $n-1$ variables. For $F\in L^2(\Omega)$ as above and $m\in\mathbb{N}$, we say that $F\in\operatorname{dom}(D^m)$,
if

\[
\mathbb{E}\big[\|D^mF\|^2_{\ell^2(\mathbb{N})^{\otimes m}}\big] = \sum_{n=m}^{\infty}\Big(\frac{n!}{(n-m)!}\Big)^2 (n-m)!\,\|f_n\|^2_{\ell^2(\mathbb{N})^{\otimes n}} < \infty.
\]


Next, we define the Ornstein-Uhlenbeck operator $L$ and its (pseudo-)inverse $L^{-1}$. The domain
$\operatorname{dom}(L)$ of $L$ is the class of all $F\in L^2(\Omega)$ for which

\[
\mathbb{E}[(LF)^2] = \sum_{n=1}^{\infty} n^2\,n!\,\|f_n\|^2_{\ell^2(\mathbb{N})^{\otimes n}} < \infty.
\]

For $F\in\operatorname{dom}(L)$ we put

\[
LF := -\sum_{n=1}^{\infty} n\,J_n(f_n).
\]

The discrete Ornstein-Uhlenbeck semigroup $(P_t)_{t\ge 0}$ associated with $L$ is defined as

\[
P_tF := \sum_{n=0}^{\infty} e^{-nt}J_n(f_n), \qquad t\ge 0. \tag{2.9}
\]

The properties of this semigroup will be discussed in detail in Section 3 below. Moreover, for
centred $F\in L^2(\Omega)$ we put

\[
L^{-1}F := -\sum_{n=1}^{\infty}\frac{1}{n}\,J_n(f_n)
\]

and call $L^{-1}$ the (pseudo-)inverse of the Ornstein-Uhlenbeck operator $L$.

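Furthermore, we introduce the discrete divergence operator $\delta$ and its domain $\operatorname{dom}(\delta)$. For
$u := (u_k)_{k\in\mathbb{N}}\in(L^2(\Omega))^{\mathbb{N}}$ with

\[
u_k := \sum_{n=0}^{\infty} J_n(g_{n+1}(\,\cdot\,,k)),
\]

where $g_{n+1}\in\ell_0^2(\mathbb{N})^{\circ n}\otimes\ell^2(\mathbb{N})$ for $n\in\mathbb{N}$, we say that $u\in\operatorname{dom}(\delta)$, if

\[
\sum_{n=0}^{\infty}(n+1)!\,\big\|\tilde g_{n+1}\mathbf{1}_{\Delta_{n+1}}\big\|^2_{\ell^2(\mathbb{N})^{\otimes(n+1)}} < \infty. \tag{2.10}
\]

Here and in the following $\tilde f(k_1,\ldots,k_n) := \frac{1}{n!}\sum_{\sigma} f(k_{\sigma(1)},\ldots,k_{\sigma(n)})$ denotes the canonical
symmetrization of a function $f$ in $n$ variables, where the sum runs over all permutations $\sigma$ of
the set $\{1,\ldots,n\}$.

For $u\in\operatorname{dom}(\delta)$, the discrete divergence operator is defined as

\[
\delta(u) := \sum_{n=0}^{\infty} J_{n+1}\big(\tilde g_{n+1}\mathbf{1}_{\Delta_{n+1}}\big).
\]

Note that, for $u\in\operatorname{dom}(\delta)$, (2.10) is equivalent to

\[
\mathbb{E}[\delta(u)^2] < \infty.
\]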

As the adjoint of the discrete gradient, $\delta$ satisfies the integration-by-parts formula

\[
\mathbb{E}[F\delta(u)] = \mathbb{E}[\langle DF,u\rangle_{\ell^2(\mathbb{N})}] \tag{2.11}
\]

for $F\in\operatorname{dom}(D)$ and $u\in\operatorname{dom}(\delta)$, cf. [30, Proposition 9.2]. The operators $D$, $L$ and $\delta$ are
related by the identity

\[
-\delta D = L. \tag{2.12}
\]

In this paper, we make use of the following crucial consequence of (2.11) and (2.12). If
$f:\mathbb{R}\to\mathbb{R}$ is measurable and $F\in L^2(\Omega)$ is centred with $f(F)\in\operatorname{dom}(D)$, then

\[
\mathbb{E}[Ff(F)] = \mathbb{E}[\langle Df(F),-DL^{-1}F\rangle_{\ell^2(\mathbb{N})}]. \tag{2.13}
\]

Indeed, using (2.11) and (2.12) we have that

\[
\mathbb{E}[Ff(F)] = \mathbb{E}[(LL^{-1}F)f(F)] = \mathbb{E}[(-\delta DL^{-1}F)f(F)] = \mathbb{E}[\langle Df(F),-DL^{-1}F\rangle_{\ell^2(\mathbb{N})}].
\]

Now, we present an analogue of the integration-by-parts formula (2.11) for functionals $F\in
L^2(\Omega)$ that do not necessarily belong to $\operatorname{dom}(D)$ (we refer to Lemma 2.2 in [17] for a related
result on the Poisson space).

Proposition 2.2. Let $F\in L^2(\Omega)$. Furthermore, let $u := (u_k)_{k\in\mathbb{N}}\in(L^2(\Omega))^{\mathbb{N}}$ with

\[
u_k := \sum_{n=0}^{\infty} J_n(g_{n+1}(\,\cdot\,,k)),
\]

where $g_{n+1}\in\ell_0^2(\mathbb{N})^{\circ n}\otimes\ell^2(\mathbb{N})$ for $n\in\mathbb{N}$ and

\[
\sum_{n=0}^{\infty}(n+1)!\,\|g_{n+1}\|^2_{\ell^2(\mathbb{N})^{\otimes(n+1)}} < \infty. \tag{2.14}
\]

Further assume that $(D_kF)u_k\ge 0$ $\mathbb{P}$-almost surely for every $k\in\mathbb{N}$. Then $u\in\operatorname{dom}(\delta)$ and

\[
\mathbb{E}[F\delta(u)] = \mathbb{E}[\langle DF,u\rangle_{\ell^2(\mathbb{N})}]. \tag{2.15}
\]

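Proof. Note that (2.14) implies (2.10) and hence $u\in\operatorname{dom}(\delta)$. Since $F\in L^2(\Omega)$, it can be
represented as

\[
F = \sum_{n=0}^{\infty} J_n(f_n)
\]

with kernels $f_0 := \mathbb{E}[F]$ and $f_n\in\ell_0^2(\mathbb{N})^{\circ n}$ for $n\in\mathbb{N}$. The isometry in (2.1) yields

\[
\begin{aligned}
\mathbb{E}[F\delta(u)] &= \mathbb{E}\Big[\Big(\sum_{n=0}^{\infty}J_n(f_n)\Big)\Big(\sum_{n=0}^{\infty}J_{n+1}\big(\tilde g_{n+1}\mathbf{1}_{\Delta_{n+1}}\big)\Big)\Big]\\
&= \sum_{n=0}^{\infty}(n+1)!\,\big\langle f_{n+1},\tilde g_{n+1}\mathbf{1}_{\Delta_{n+1}}\big\rangle_{\ell^2(\mathbb{N})^{\otimes(n+1)}}\\
&= \sum_{n=0}^{\infty}(n+1)!\,\langle f_{n+1},g_{n+1}\rangle_{\ell^2(\mathbb{N})^{\otimes(n+1)}}.
\end{aligned} \tag{2.16}
\]

Note that the last step in (2.16) is valid, since, for every $n\in\mathbb{N}$, $f_n$ is symmetric and vanishes
on diagonals.

Since $(D_kF)u_k\ge 0$ $\mathbb{P}$-almost surely for every $k\in\mathbb{N}$ and by the isometry formula for discrete
multiple stochastic integrals, we get

\[
\begin{aligned}
\mathbb{E}[\langle DF,u\rangle_{\ell^2(\mathbb{N})}] &= \sum_{k=1}^{\infty}\mathbb{E}[(D_kF)u_k]\\
&= \sum_{k=1}^{\infty}\mathbb{E}\Big[\Big(\sum_{n=0}^{\infty}(n+1)J_n(f_{n+1}(\,\cdot\,,k))\Big)\Big(\sum_{n=0}^{\infty}J_n(g_{n+1}(\,\cdot\,,k))\Big)\Big]\\
&= \sum_{k=1}^{\infty}\sum_{n=0}^{\infty}(n+1)!\,\langle f_{n+1}(\,\cdot\,,k),g_{n+1}(\,\cdot\,,k)\rangle_{\ell^2(\mathbb{N})^{\otimes n}}\\
&= \sum_{n=0}^{\infty}(n+1)!\sum_{k=1}^{\infty}\langle f_{n+1}(\,\cdot\,,k),g_{n+1}(\,\cdot\,,k)\rangle_{\ell^2(\mathbb{N})^{\otimes n}}\\
&= \sum_{n=0}^{\infty}(n+1)!\,\langle f_{n+1},g_{n+1}\rangle_{\ell^2(\mathbb{N})^{\otimes(n+1)}}.
\end{aligned} \tag{2.17}
\]

Note that the exchange of summation in the penultimate step of (2.17) is valid by Fubini’s
theorem, since a repeated application of the Cauchy-Schwarz inequality yields that

\[
\begin{aligned}
&\sum_{n=0}^{\infty}\sum_{k=1}^{\infty}\big|(n+1)!\,\langle f_{n+1}(\,\cdot\,,k),g_{n+1}(\,\cdot\,,k)\rangle_{\ell^2(\mathbb{N})^{\otimes n}}\big|\\
&\quad\le \sum_{n=0}^{\infty}(n+1)!\sum_{k=1}^{\infty}\|f_{n+1}(\,\cdot\,,k)\|_{\ell^2(\mathbb{N})^{\otimes n}}\,\|g_{n+1}(\,\cdot\,,k)\|_{\ell^2(\mathbb{N})^{\otimes n}}\\
&\quad\le \sum_{n=0}^{\infty}(n+1)!\,\Big(\sum_{k=1}^{\infty}\|f_{n+1}(\,\cdot\,,k)\|^2_{\ell^2(\mathbb{N})^{\otimes n}}\Big)^{1/2}\Big(\sum_{k=1}^{\infty}\|g_{n+1}(\,\cdot\,,k)\|^2_{\ell^2(\mathbb{N})^{\otimes n}}\Big)^{1/2}\\
&\quad= \sum_{n=0}^{\infty}(n+1)!\,\|f_{n+1}\|_{\ell^2(\mathbb{N})^{\otimes(n+1)}}\,\|g_{n+1}\|_{\ell^2(\mathbb{N})^{\otimes(n+1)}}\\
&\quad\le \Big(\sum_{n=0}^{\infty}(n+1)!\,\|f_{n+1}\|^2_{\ell^2(\mathbb{N})^{\otimes(n+1)}}\Big)^{1/2}\Big(\sum_{n=0}^{\infty}(n+1)!\,\|g_{n+1}\|^2_{\ell^2(\mathbb{N})^{\otimes(n+1)}}\Big)^{1/2}\\
&\quad\le \big(\mathbb{E}[F^2]\big)^{1/2}\Big(\sum_{n=0}^{\infty}(n+1)!\,\|g_{n+1}\|^2_{\ell^2(\mathbb{N})^{\otimes(n+1)}}\Big)^{1/2} < \infty.
\end{aligned}
\]

Comparing (2.16) and (2.17) completes the proof. □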


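Finally, we recall the following Skorohod isometric formula for the discrete divergence
operator. Namely, for all $u\in\operatorname{dom}(\delta)$ it holds that

\[
\mathbb{E}[\delta(u)^2] = \mathbb{E}\big[\|u\|^2_{\ell^2(\mathbb{N})}\big] + \mathbb{E}\Big[\sum_{\substack{k,\ell=1\\ k\neq\ell}}^{\infty}(D_ku_\ell)(D_\ell u_k) - \sum_{k=1}^{\infty}(D_ku_k)^2\Big] \tag{2.18}
\]

according to Proposition 9.3 in [30].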


3. The discrete Ornstein-Uhlenbeck semigroup 
For real $t\ge 0$ define the random sequence $X^t := (X_k^t)_{k\in\mathbb{N}}$ by

\[
X_k^t := X_k^*\,\mathbf{1}_{\{Z_k\le t\}} + X_k\,\mathbf{1}_{\{Z_k>t\}},
\]

where $(X_k^*)_{k\in\mathbb{N}}$ is an independent copy of the Rademacher sequence $X = (X_k)_{k\in\mathbb{N}}$ and
$(Z_k)_{k\in\mathbb{N}}$ is a sequence of independent and exponentially distributed random variables with mean 1,
independent of all other random variables.

Our first result is a discrete analogue of Mehler’s formula on the Wiener or Poisson chaos
for which we refer to [20] and [17], respectively. It expresses the discrete Ornstein-Uhlenbeck
semigroup $(P_t)_{t\ge 0}$ defined at (2.9) in terms of a conditional expectation. Note that this has
already been shown in [30, Proposition 10.8]. Since Mehler’s formula is a central device in
our approach, we include an elementary and direct proof.

Proposition 3.1 (Mehler’s formula). Let $F\in L^2(\Omega)$. The process $(X^t)_{t\ge 0}$ is the
Ornstein-Uhlenbeck process associated with $(P_t)_{t\ge 0}$ by the relation

\[
P_tF = \mathbb{E}[F(X^t)\,|\,X] \qquad \mathbb{P}\text{-a.s.}
\]

for all $t\ge 0$.

Proof. We first notice that for each $t\ge 0$, $(X_k^t)_{k\in\mathbb{N}}$ is a sequence of independent Rademacher
random variables with the same distribution as the sequence $(X_k)_{k\in\mathbb{N}}$. Thus, if $F = \mathbb{E}[F]+\sum_{n=1}^{\infty}J_n(f_n)$,
then $F(X^t)$ has chaotic decomposition

\[
F(X^t) = \mathbb{E}[F] + \sum_{n=1}^{\infty} n!\sum_{1\le i_1<\ldots<i_n<\infty} f_n(i_1,\ldots,i_n)\,Y^t_{i_1}\cdots Y^t_{i_n}, \tag{3.1}
\]

where both decompositions share the same kernels $f_n\in\ell_0^2(\mathbb{N})^{\circ n}$, for $n\in\mathbb{N}$, and where the
sequence $(Y_k^t)_{k\in\mathbb{N}}$ with

\[
Y_k^t := (X_k^t-p_k+q_k)/(2\sqrt{p_kq_k}) = Y_k^*\,\mathbf{1}_{\{Z_k\le t\}} + Y_k\,\mathbf{1}_{\{Z_k>t\}}, \tag{3.2}
\]

for $t\ge 0$, is the normalization of the sequence $(X_k^t)_{k\in\mathbb{N}}$. Here, the random variable $Y_k^*$ is
the normalization of $X_k^*$ for every $k\in\mathbb{N}$. Using the independence of the sequences $(X_k)_{k\in\mathbb{N}}$,
$(X_k^*)_{k\in\mathbb{N}}$ and $(Z_k)_{k\in\mathbb{N}}$ we deduce from (3.2) that

\[
\mathbb{E}[Y_k^t\,|\,X_k] = \mathbb{E}[Y_k^*]\cdot\mathbb{P}(Z_k\le t) + Y_k\cdot\mathbb{P}(Z_k>t) = Y_k\,e^{-t}.
\]

For a functional $F_d$ only depending on the first $d$ Rademacher random variables, with kernels
denoted by $f_n^{(d)}$, we compute by using the chaotic decomposition in (3.1) as well as linearity and independence,

\[
\begin{aligned}
\mathbb{E}[F_d(X_1^t,\ldots,X_d^t)\,|\,X]
&= \mathbb{E}[F_d(X_1,\ldots,X_d)] + \sum_{n=1}^{d} n!\sum_{1\le i_1<\ldots<i_n\le d} f_n^{(d)}(i_1,\ldots,i_n)\,\mathbb{E}[Y^t_{i_1}\,|\,X_{i_1}]\cdots\mathbb{E}[Y^t_{i_n}\,|\,X_{i_n}]\\
&= \mathbb{E}[F_d(X_1,\ldots,X_d)] + \sum_{n=1}^{d} e^{-nt}\,n!\sum_{1\le i_1<\ldots<i_n\le d} f_n^{(d)}(i_1,\ldots,i_n)\,Y_{i_1}\cdots Y_{i_n}\\
&= \mathbb{E}[F_d(X_1,\ldots,X_d)] + \sum_{n=1}^{d} e^{-nt}J_n\big(f_n^{(d)}\big)\\
&= P_tF_d(X_1,\ldots,X_d).
\end{aligned} \tag{3.3}
\]

The general case follows from (3.3) due to the fact that the set of functionals depending only
on finitely many Rademacher variables is dense in $L^2(\Omega)$ and that both sides of (3.3) are
continuous functions of $F_d$. □
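For functionals of finitely many coordinates, both sides of Mehler's formula can be computed exactly and compared. The Python sketch below is an illustration under stated assumptions: coordinates are indexed from 0, the chaos coefficients are obtained as $c_S = \mathbb{E}[F\prod_{k\in S}Y_k]$ for subsets $S$ (which corresponds to $n!\,f_n$ evaluated at the indices in $S$ by Stroock's formula), and the conditional expectation is evaluated through the one-coordinate transition rule "keep $x_k$ with probability $e^{-t}$, otherwise resample from the law of $X_k$" implied by the definition of $X^t$. The function names are hypothetical.

```python
import itertools
import math


def mu(k, y, p):
    """P(X_k = y) for y in {+1, -1}."""
    return p[k] if y == 1 else 1 - p[k]


def cube(d):
    return list(itertools.product((1, -1), repeat=d))


def Y_S(S, w, p):
    """Product of the normalized coordinates Y_k(w) over k in S (empty product = 1)."""
    return math.prod(
        (w[k] - p[k] + (1 - p[k])) / (2 * math.sqrt(p[k] * (1 - p[k]))) for k in S
    )


def semigroup_via_chaos(F, t, x, p):
    """P_t F(x) from (2.9): damp the chaos component of order n by exp(-n t)."""
    d = len(p)
    total = 0.0
    for n in range(d + 1):
        for S in itertools.combinations(range(d), n):
            c_S = sum(
                math.prod(mu(k, w[k], p) for k in range(d)) * F(w) * Y_S(S, w, p)
                for w in cube(d)
            )
            total += math.exp(-n * t) * c_S * Y_S(S, x, p)
    return total


def semigroup_via_mehler(F, t, x, p):
    """E[F(X^t) | X = x]: each coordinate is kept w.p. exp(-t), else resampled."""
    d = len(p)
    keep = math.exp(-t)
    total = 0.0
    for y in cube(d):
        weight = math.prod(
            keep * (y[k] == x[k]) + (1 - keep) * mu(k, y[k], p) for k in range(d)
        )
        total += weight * F(y)
    return total


p = (0.2, 0.5, 0.7)
F = lambda w: max(w) * w[0] + w[1] * w[2] - 3 * w[0] * w[1] * w[2]
x = (1, -1, 1)
print(semigroup_via_chaos(F, 0.7, x, p), semigroup_via_mehler(F, 0.7, x, p))
```

The two printed numbers agree up to rounding, which mirrors the computation (3.3) for $d=3$.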


As a next step, we derive an integral representation for the expression $-D^mL^{-1}F$, i.e., the
$m$-fold iterated discrete gradient applied to $-L^{-1}F$.


Proposition 3.2. For $m,k_1,\ldots,k_m\in\mathbb{N}$ and centred $F\in\operatorname{dom}(D^m)$ one has that

\[
-D^m_{k_1,\ldots,k_m}L^{-1}F = \int_0^{\infty} e^{-mt}\,P_tD^m_{k_1,\ldots,k_m}F\,dt \qquad \mathbb{P}\text{-a.s.}
\]


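Proof. Since $F\in L^2(\Omega)$ is centred, there are kernels $f_n\in\ell_0^2(\mathbb{N})^{\circ n}$, $n\in\mathbb{N}$, such that
$F = \sum_{n=1}^{\infty}J_n(f_n)$. Fix $d\in\{m,m+1,\ldots\}$ and consider the truncated functional
$F_d := \sum_{n=1}^{d}J_n(f_n)$. Then,

\[
\begin{aligned}
-D^m_{k_1,\ldots,k_m}L^{-1}F_d &= \sum_{n=m}^{d}\frac{(n-1)!}{(n-m)!}\,J_{n-m}(f_n(\,\cdot\,,k_1,\ldots,k_m))\\
&= \int_0^{\infty} e^{-mt}\sum_{n=m}^{d}\frac{n!}{(n-m)!}\,e^{-(n-m)t}\,J_{n-m}(f_n(\,\cdot\,,k_1,\ldots,k_m))\,dt,
\end{aligned} \tag{3.4}
\]

where we used that $\int_0^{\infty} n\,e^{-nt}\,dt = 1$. By continuity of $D^m_{k_1,\ldots,k_m}$ and $L^{-1}$ one has that
$-D^m_{k_1,\ldots,k_m}L^{-1}F_d$ converges to $-D^m_{k_1,\ldots,k_m}L^{-1}F$ in $L^2(\Omega)$, as $d\to\infty$. To show that the right
hand side of (3.4) converges to

\[
\int_0^{\infty} e^{-mt}\,P_tD^m_{k_1,\ldots,k_m}F\,dt
\]

in $L^2(\Omega)$, as $d\to\infty$, we consider the remainder term

\[
R_{m,d} := \int_0^{\infty} e^{-mt}\,P_tD^m_{k_1,\ldots,k_m}F\,dt - \big(-D^m_{k_1,\ldots,k_m}L^{-1}F_d\big)
\]

and show that $\mathbb{E}[R^2_{m,d}]$ vanishes, as $d\to\infty$. First, use (3.4) to see that

\[
R_{m,d} = \int_0^{\infty} e^{-mt}\sum_{n=d+1}^{\infty}\frac{n!}{(n-m)!}\,e^{-(n-m)t}\,J_{n-m}(f_n(\,\cdot\,,k_1,\ldots,k_m))\,dt.
\]

We then apply Jensen’s inequality, Fubini’s theorem and the isometry property of discrete
multiple stochastic integrals to conclude that

\[
\begin{aligned}
\mathbb{E}[R^2_{m,d}] &= \mathbb{E}\Big[\Big(\int_0^{\infty} e^{-mt}\sum_{n=d+1}^{\infty}\frac{n!}{(n-m)!}\,e^{-(n-m)t}\,J_{n-m}(f_n(\,\cdot\,,k_1,\ldots,k_m))\,dt\Big)^2\Big]\\
&\le \mathbb{E}\Big[\int_0^{\infty} e^{-(2m-1)t}\Big(\sum_{n=d+1}^{\infty}\frac{n!}{(n-m)!}\,e^{-(n-m)t}\,J_{n-m}(f_n(\,\cdot\,,k_1,\ldots,k_m))\Big)^2dt\Big]\\
&= \int_0^{\infty} e^{-(2m-1)t}\sum_{n=d+1}^{\infty}\Big(\frac{n!}{(n-m)!}\Big)^2 e^{-2(n-m)t}\,(n-m)!\,\|f_n(\,\cdot\,,k_1,\ldots,k_m)\|^2_{\ell^2(\mathbb{N})^{\otimes(n-m)}}\,dt\\
&\le \sum_{n=d+1}^{\infty}\Big(\frac{n!}{(n-m)!}\Big)^2 (n-m)!\,\|f_n(\,\cdot\,,k_1,\ldots,k_m)\|^2_{\ell^2(\mathbb{N})^{\otimes(n-m)}},
\end{aligned}
\]

where we used that $\int_0^{\infty} e^{-(2n-1)t}\,dt = (2n-1)^{-1}\le 1$. Since $F\in\operatorname{dom}(D^m)$, the latter
expression is finite and converges to zero, as $d\to\infty$. This concludes the proof. □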

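Our next result combines the previous two propositions and is one of the key tools in the proof
of our general Berry-Esseen bound in Section 4. Similar relations also hold on the Wiener and
the Poisson space for which we refer to [23] and [17], respectively. Although from a formal
point of view the statement looks similar to these results, we emphasize that the proof as
well as the meaning and the interpretation of the involved Malliavin operators in our discrete
framework are different.

Proposition 3.3. For $m,k_1,\ldots,k_m\in\mathbb{N}$, $\alpha\ge 1$ and centred $F\in\operatorname{dom}(D^m)$ one has that

\[
\mathbb{E}\big[|D^m_{k_1,\ldots,k_m}L^{-1}F|^{\alpha}\big] \le \mathbb{E}\big[|D^m_{k_1,\ldots,k_m}F|^{\alpha}\big].
\]

Proof. According to Proposition 3.2, we have that

\[
\mathbb{E}\big[|D^m_{k_1,\ldots,k_m}L^{-1}F|^{\alpha}\big] = \mathbb{E}\Big[\Big|\int_0^{\infty} e^{-mt}\,P_tD^m_{k_1,\ldots,k_m}F\,dt\Big|^{\alpha}\Big].
\]

Then, using Proposition 3.1 together with Jensen’s inequality, we deduce that

\[
\begin{aligned}
\mathbb{E}\Big[\Big|\int_0^{\infty} e^{-mt}\,P_tD^m_{k_1,\ldots,k_m}F\,dt\Big|^{\alpha}\Big]
&= \mathbb{E}\Big[\Big|\int_0^{\infty} e^{-mt}\,\mathbb{E}[D^m_{k_1,\ldots,k_m}F(X^t)\,|\,X]\,dt\Big|^{\alpha}\Big]\\
&\le \mathbb{E}\Big[\int_0^{\infty} e^{-mt}\,\mathbb{E}\big[|D^m_{k_1,\ldots,k_m}F(X^t)|^{\alpha}\,\big|\,X\big]\,dt\Big]\\
&= \int_0^{\infty} e^{-mt}\,\mathbb{E}\big[|D^m_{k_1,\ldots,k_m}F|^{\alpha}\big]\,dt\\
&\le \mathbb{E}\big[|D^m_{k_1,\ldots,k_m}F|^{\alpha}\big]
\end{aligned}
\]

and complete the proof. □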


As a first application of Proposition 3.3 we now deduce the following discrete Poincaré-type
inequality. This result can already be found in [30, Chapter 8], where it is proved by means
of the Clark formula. We present an alternative proof without resorting to this formula.

Proposition 3.4. Suppose that $F\in\operatorname{dom}(D)$. Then

\[
\operatorname{Var}[F] \le \mathbb{E}\big[\|DF\|^2_{\ell^2(\mathbb{N})}\big]. \tag{3.5}
\]


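Proof. Choosing $f$ in (2.13) as the identity map on $\mathbb{R}$ yields

\[
\begin{aligned}
\operatorname{Var}[F] = \mathbb{E}[(F-\mathbb{E}[F])^2] &= \mathbb{E}\big[\langle D(F-\mathbb{E}[F]),-DL^{-1}(F-\mathbb{E}[F])\rangle_{\ell^2(\mathbb{N})}\big]\\
&= \mathbb{E}\Big[\sum_{k=1}^{\infty}\big(D_k(F-\mathbb{E}[F])\big)\big(-D_kL^{-1}(F-\mathbb{E}[F])\big)\Big]\\
&\le \mathbb{E}\Big[\sum_{k=1}^{\infty}\big|D_k(F-\mathbb{E}[F])\big|\,\big|D_kL^{-1}(F-\mathbb{E}[F])\big|\Big].
\end{aligned}
\]

Exchanging expectation and summation, and using the Cauchy-Schwarz inequality, we see
that the latter expression is further bounded by

\[
\sum_{k=1}^{\infty}\Big(\mathbb{E}\big[(D_k(F-\mathbb{E}[F]))^2\big]\Big)^{1/2}\Big(\mathbb{E}\big[(D_kL^{-1}(F-\mathbb{E}[F]))^2\big]\Big)^{1/2}.
\]

The proof is now concluded by applying Proposition 3.3 with $m=1$ and $\alpha=2$ and using the
fact that $D_k(F-\mathbb{E}[F]) = D_kF$. □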


Remark 3.1. Proposition 3.4 remains valid for $F \in L^1(\Omega) \setminus \operatorname{dom}(D)$, since in this case the
right hand side of (3.5) is infinite.
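
The inequality (3.5) can be checked numerically for a functional of finitely many Rademacher variables. The following is a minimal sketch, assuming the convention $D_kF = \sqrt{p_kq_k}\,(F_k^+ - F_k^-)$ used in Section 5; the function names and the example functional are hypothetical and chosen only for illustration.

```python
import itertools
import math


def variance_and_energy(f, p):
    """Return (Var[F], E[||DF||^2]) for F = f(X_1, ..., X_n) by exact enumeration.

    p[k] is the probability that the k-th Rademacher variable equals +1.
    """
    n = len(p)
    mean = second_moment = energy = 0.0
    for x in itertools.product((1, -1), repeat=n):
        # Probability of the sign pattern x under the product measure.
        weight = math.prod(p[k] if x[k] == 1 else 1.0 - p[k] for k in range(n))
        value = f(x)
        mean += weight * value
        second_moment += weight * value * value
        for k in range(n):
            # (D_k F)^2 = p_k q_k (F_k^+ - F_k^-)^2
            plus = f(x[:k] + (1,) + x[k + 1:])
            minus = f(x[:k] + (-1,) + x[k + 1:])
            energy += weight * p[k] * (1.0 - p[k]) * (plus - minus) ** 2
    return second_moment - mean ** 2, energy


def example_functional(x):
    # Hypothetical non-linear functional of four Rademacher variables.
    return x[0] * x[1] + 2 * x[2] - x[0] * x[1] * x[3]


var, energy = variance_and_energy(example_functional, (0.3, 0.5, 0.8, 0.1))
print(var <= energy)
```

For a single variable $F = X_1$ both sides equal $4p_1q_1$, so the inequality cannot be improved in general.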




























4. A GENERAL BERRY-ESSEEN BOUND 

The main result of this section is a Berry-Esseen bound for square-integrable Rademacher 
functionals F. By such a result we mean an upper bound for the Kolmogorov distance between 
F and a standard Gaussian random variable, where we recall that the Kolmogorov distance 
between two random variables X and Y is defined as 


\[ d_K(X,Y) := \sup_{x \in \mathbb{R}} \big|\mathbb{P}(X \leq x) - \mathbb{P}(Y \leq x)\big|. \]
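
As an illustration of this distance, the sketch below computes it exactly for a normalized binomial random variable against a standard Gaussian one. This is the classical Berry-Esseen setting, not one of the functionals studied in this paper, and the function names are hypothetical. The supremum is attained at a jump of the binomial distribution function, either at its value or at its left limit.

```python
import math


def normal_cdf(x):
    """Distribution function of the standard Gaussian law."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def kolmogorov_binomial_vs_normal(n, p):
    """sup_x |P(F <= x) - P(N <= x)| for F = (Z - np) / sqrt(npq), Z ~ Bin(n, p)."""
    mean = n * p
    sd = math.sqrt(n * p * (1.0 - p))
    cdf = 0.0
    dist = 0.0
    for k in range(n + 1):
        phi = normal_cdf((k - mean) / sd)
        dist = max(dist, abs(cdf - phi))   # left limit at the jump
        cdf += math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)
        dist = max(dist, abs(cdf - phi))   # value at the jump
    return dist


print(kolmogorov_binomial_vs_normal(100, 0.5))
print(kolmogorov_binomial_vs_normal(400, 0.5))
```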


A first result in this direction has been shown by the authors in [14] in the special symmetric
case that the sequence $p = (p_k)_{k \in \mathbb{N}}$ is constant and equal to 1/2. In the present situation, we
need the following generalization to arbitrary sequences $p$. Since the proof follows
straightforwardly along the lines of that of Theorem 3.1 in [14], we omit the arguments.


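Proposition 4.1. Let $F \in \operatorname{dom}(D)$ with $\mathbb{E}[F] = 0$ and let $N \sim \mathcal{N}(0,1)$ be a standard
Gaussian random variable. Then,

\[
\begin{aligned}
d_K(F,N) \leq{} & \mathbb{E}\big[|1 - \langle DF, -DL^{-1}F \rangle_{\ell^2(\mathbb{N})}|\big] + \frac{\sqrt{2\pi}}{8}\, \mathbb{E}\big[\langle (pq)^{-1/2}(DF)^2, |DL^{-1}F| \rangle_{\ell^2(\mathbb{N})}\big] \\
& + \frac{1}{2}\, \mathbb{E}\big[\langle (pq)^{-1/2}(DF)^2, |F \cdot DL^{-1}F| \rangle_{\ell^2(\mathbb{N})}\big] \\
& + \sup_{x \in \mathbb{R}} \mathbb{E}\big[\langle (pq)^{-1/2}(DF)\, D\mathbf{1}_{\{F > x\}}, |DL^{-1}F| \rangle_{\ell^2(\mathbb{N})}\big].
\end{aligned}
\]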


One disadvantage of the bound in Proposition 4.1 is that it involves the inverse of the discrete
Ornstein-Uhlenbeck operator. In applications this means that the chaotic decomposition of
the Rademacher functional $F$ has to be computed explicitly in order to evaluate the expression
$-DL^{-1}F$. A further analysis of the bound then requires a multiplication formula for
discrete multiple stochastic integrals, which expresses a product of two discrete multiple
stochastic integrals as a linear combination of discrete multiple stochastic integrals. We transfer
the bound of Proposition 4.1 into a form which can be evaluated without using a multiplication
formula. Our next result is a combination of Proposition 4.1 and Proposition 3.3
and provides an upper bound for $d_K(F,N)$ in terms of the first- and second-order discrete
gradient only. A result of this structure is what is called a 'second-order Poincaré inequality'
in the literature, see [3,17,23]. The discrete Poincaré-type inequality in Proposition 3.4 says
that a Rademacher functional $F$ is concentrated around $\mathbb{E}[F]$ in terms of the variance if the
contribution of the first-order discrete gradient is small. Our discrete second-order Poincaré
inequality additionally implies that if the contribution of the second-order discrete gradient
is also small, then $F$ is close to a standard Gaussian random variable.


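Theorem 4.1. Let $F \in \operatorname{dom}(D)$ with $\mathbb{E}[F] = 0$ and $\mathbb{E}[F^2] = 1$, and let $N \sim \mathcal{N}(0,1)$ be a
standard Gaussian random variable. Further, fix $r,s,t \in (1,\infty)$ such that $\frac{1}{r} + \frac{1}{s} + \frac{1}{t} = 1$.
Then,

\[
\begin{aligned}
d_K(F,N) \leq{} & \Big(\frac{15}{4} \sum_{j,k,\ell=1}^\infty \big(\mathbb{E}[(D_jF)^2(D_kF)^2]\big)^{1/2} \big(\mathbb{E}[(D_\ell D_jF)^2(D_\ell D_kF)^2]\big)^{1/2}\Big)^{1/2} \\
& + \Big(\frac{3}{4} \sum_{j,k,\ell=1}^\infty \frac{1}{p_\ell q_\ell}\, \mathbb{E}[(D_\ell D_jF)^2(D_\ell D_kF)^2]\Big)^{1/2}
 + \frac{\sqrt{2\pi}}{8} \sum_{k=1}^\infty \frac{1}{\sqrt{p_kq_k}}\, \mathbb{E}[|D_kF|^3] \\
& + \frac{1}{2} \big(\mathbb{E}[|F|^r]\big)^{1/r} \sum_{k=1}^\infty \frac{1}{\sqrt{p_kq_k}} \big(\mathbb{E}[|D_kF|^{2s}]\big)^{1/s} \big(\mathbb{E}[|D_kF|^{t}]\big)^{1/t} \\
& + \Big(\sum_{k=1}^\infty \frac{1}{p_kq_k}\, \mathbb{E}[(D_kF)^4]\Big)^{1/2}
 + \Big(6 \sum_{k,\ell=1}^\infty \frac{1}{p_kq_k} \big(\mathbb{E}[(D_kF)^4]\big)^{1/2} \big(\mathbb{E}[(D_\ell D_kF)^4]\big)^{1/2}\Big)^{1/2} \\
& + \Big(3 \sum_{k,\ell=1}^\infty \frac{1}{p_kq_kp_\ell q_\ell}\, \mathbb{E}[(D_\ell D_kF)^4]\Big)^{1/2}.
\end{aligned}
\]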


Let us comment on the second-order Poincaré inequality in Theorem 4.1. Its form differs
from that in the Wiener or Poisson case treated in [17, 23]. The main difference is the fourth
term, which involves the parameters $r$, $s$ and $t$, and hence higher moments of $F$ and $D_kF$.
In many applications one can choose $r = 2$ and $s = t = 4$, but there are situations in which
the additional flexibility to choose $r$, $s$ and $t$ differently turns out to be crucial. We shall
meet such an example in the proof of Theorem 1.1 on the triangle counting statistic in the
Erdős-Rényi random graph.


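Proof of Theorem 4.1. Our proof follows the general scheme to establish a second-order
Poincaré inequality, which is used in the literature [3,17,23]. Namely, we build on Proposition 4.1
by further estimating each summand of the bound there. We start with the first summand,
to which we apply the Cauchy-Schwarz inequality:

\[ \mathbb{E}\big[|1 - \langle DF, -DL^{-1}F \rangle_{\ell^2(\mathbb{N})}|\big] \leq \big(\mathbb{E}\big[(1 - \langle DF, -DL^{-1}F \rangle_{\ell^2(\mathbb{N})})^2\big]\big)^{1/2}. \]

Taking $f$ as the identity on $\mathbb{R}$ in (2.13) shows that $\mathbb{E}[\langle DF, -DL^{-1}F \rangle_{\ell^2(\mathbb{N})}] = \operatorname{Var}[F] = 1$.
Thus, $\mathbb{E}[(1 - \langle DF, -DL^{-1}F \rangle_{\ell^2(\mathbb{N})})^2] = \operatorname{Var}[\langle DF, -DL^{-1}F \rangle_{\ell^2(\mathbb{N})}]$ and an application of
Proposition 3.4 (see also Remark 3.1) yields

\[
\begin{aligned}
\mathbb{E}\big[(1 - \langle DF, -DL^{-1}F \rangle_{\ell^2(\mathbb{N})})^2\big]
&\leq \mathbb{E}\big[\|D\langle DF, -DL^{-1}F \rangle_{\ell^2(\mathbb{N})}\|^2_{\ell^2(\mathbb{N})}\big] \\
&= \mathbb{E}\Big[\sum_{\ell=1}^\infty \Big(D_\ell \sum_{k=1}^\infty (D_kF)(-D_kL^{-1}F)\Big)^2\Big] \\
&= \mathbb{E}\Big[\sum_{\ell=1}^\infty \Big(\sum_{k=1}^\infty D_\ell\big((D_kF)(-D_kL^{-1}F)\big)\Big)^2\Big], 
\end{aligned}
\tag{4.1}
\]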


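where the exchange of $D_\ell$ with the summation in the last step can be justified as follows. Since
$\mathbb{E}[\langle DF, -DL^{-1}F \rangle_{\ell^2(\mathbb{N})}] = 1$, $\langle DF, -DL^{-1}F \rangle_{\ell^2(\mathbb{N})}$ is $\mathbb{P}$-a.s. finite. Thus, $\langle (DF)_\ell^{\pm}, (-DL^{-1}F)_\ell^{\pm} \rangle_{\ell^2(\mathbb{N})}$
as well as the path-wise representation of $D_\ell(\langle DF, -DL^{-1}F \rangle_{\ell^2(\mathbb{N})})$ as at (2.3) are $\mathbb{P}$-a.s. finite
for $\ell \in \mathbb{N}$. As a consequence, we see that

\[
\begin{aligned}
D_\ell\big(\langle DF, -DL^{-1}F \rangle_{\ell^2(\mathbb{N})}\big)
&= \sqrt{p_\ell q_\ell}\, \big(\langle (DF)_\ell^+, (-DL^{-1}F)_\ell^+ \rangle_{\ell^2(\mathbb{N})} - \langle (DF)_\ell^-, (-DL^{-1}F)_\ell^- \rangle_{\ell^2(\mathbb{N})}\big) \\
&= \sqrt{p_\ell q_\ell}\, \Big(\sum_{k=1}^\infty (D_kF)_\ell^+ (-D_kL^{-1}F)_\ell^+ - \sum_{k=1}^\infty (D_kF)_\ell^- (-D_kL^{-1}F)_\ell^-\Big) \\
&= \sum_{k=1}^\infty \sqrt{p_\ell q_\ell}\, \big((D_kF)_\ell^+ (-D_kL^{-1}F)_\ell^+ - (D_kF)_\ell^- (-D_kL^{-1}F)_\ell^-\big) \\
&= \sum_{k=1}^\infty D_\ell\big((D_kF)(-D_kL^{-1}F)\big) \qquad \mathbb{P}\text{-a.s.}
\end{aligned}
\]

for $\ell \in \mathbb{N}$. Now, we further estimate the quantity $D_\ell((D_kF)(-D_kL^{-1}F))$ in (4.1) using the
product formula (2.4). This yields

\[
\begin{aligned}
\big|D_\ell\big((D_kF)(-D_kL^{-1}F)\big)\big|
&= \Big|(D_\ell D_kF)(-D_kL^{-1}F) + (D_kF)(-D_\ell D_kL^{-1}F) - \frac{X_\ell}{\sqrt{p_\ell q_\ell}} (D_\ell D_kF)(-D_\ell D_kL^{-1}F)\Big| \\
&\leq |(D_\ell D_kF)(-D_kL^{-1}F)| + |(D_kF)(-D_\ell D_kL^{-1}F)| + \frac{1}{\sqrt{p_\ell q_\ell}}\, |(D_\ell D_kF)(-D_\ell D_kL^{-1}F)|.
\end{aligned}
\]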


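Using this together with the Cauchy-Schwarz inequality, it follows from (4.1) that

\[ \mathbb{E}\big[(1 - \langle DF, -DL^{-1}F \rangle_{\ell^2(\mathbb{N})})^2\big] \leq 3(T_1 + T_2 + T_3), \tag{4.2} \]

where $T_1$, $T_2$ and $T_3$ are given by

\[
\begin{aligned}
T_1 &:= \mathbb{E}\Big[\sum_{\ell=1}^\infty \Big(\sum_{k=1}^\infty |(D_\ell D_kF)(-D_kL^{-1}F)|\Big)^2\Big], \\
T_2 &:= \mathbb{E}\Big[\sum_{\ell=1}^\infty \Big(\sum_{k=1}^\infty |(D_kF)(-D_\ell D_kL^{-1}F)|\Big)^2\Big], \\
T_3 &:= \mathbb{E}\Big[\sum_{\ell=1}^\infty \frac{1}{p_\ell q_\ell} \Big(\sum_{k=1}^\infty |(D_\ell D_kF)(-D_\ell D_kL^{-1}F)|\Big)^2\Big].
\end{aligned}
\]
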

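Each of these terms is now further estimated from above. Considering $T_1$, an application of
Proposition 3.2 and Proposition 3.1 as well as Jensen's inequality yields for $\ell \in \mathbb{N}$ that

\[
\begin{aligned}
\Big(\sum_{k=1}^\infty |D_\ell D_kF|\, |D_kL^{-1}F|\Big)^2
&= \Big(\sum_{k=1}^\infty |D_\ell D_kF|\, \Big|\int_0^\infty e^{-t} P_tD_kF\,dt\Big|\Big)^2 \\
&= \Big(\sum_{k=1}^\infty |D_\ell D_kF|\, \Big|\int_0^\infty e^{-t}\, \mathbb{E}[D_kF(X^t) \mid X]\,dt\Big|\Big)^2 \\
&\leq \Big(\sum_{k=1}^\infty |D_\ell D_kF| \int_0^\infty e^{-t}\, \mathbb{E}[|D_kF(X^t)| \mid X]\,dt\Big)^2.
\end{aligned}
\]

By virtue of the monotone convergence theorem, we get for $\ell \in \mathbb{N}$ that

\[
\begin{aligned}
\Big(\sum_{k=1}^\infty |D_\ell D_kF| \int_0^\infty e^{-t}\, \mathbb{E}[|D_kF(X^t)| \mid X]\,dt\Big)^2
&= \Big(\int_0^\infty e^{-t} \sum_{k=1}^\infty |D_\ell D_kF|\, \mathbb{E}[|D_kF(X^t)| \mid X]\,dt\Big)^2 \\
&= \Big(\int_0^\infty e^{-t}\, \mathbb{E}\Big[\sum_{k=1}^\infty |D_\ell D_kF|\, |D_kF(X^t)| \,\Big|\, X\Big]\,dt\Big)^2.
\end{aligned}
\]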

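Using Jensen's inequality again as well as the Cauchy-Schwarz inequality, we now conclude
for $\ell \in \mathbb{N}$ that

\[
\begin{aligned}
\Big(\int_0^\infty e^{-t}\, \mathbb{E}\Big[\sum_{k=1}^\infty |D_\ell D_kF|\, |D_kF(X^t)| \,\Big|\, X\Big]\,dt\Big)^2
&\leq \int_0^\infty e^{-t}\, \mathbb{E}\Big[\Big(\sum_{k=1}^\infty |D_\ell D_kF|\, |D_kF(X^t)|\Big)^2 \,\Big|\, X\Big]\,dt \\
&= \int_0^\infty e^{-t}\, \mathbb{E}\Big[\sum_{j,k=1}^\infty |D_\ell D_jF|\, |D_jF(X^t)|\, |D_\ell D_kF|\, |D_kF(X^t)| \,\Big|\, X\Big]\,dt \\
&= \sum_{j,k=1}^\infty |D_\ell D_jF|\, |D_\ell D_kF| \int_0^\infty e^{-t}\, \mathbb{E}\big[|D_jF(X^t)|\, |D_kF(X^t)| \,\big|\, X\big]\,dt \\
&\leq \sum_{j,k=1}^\infty |D_\ell D_jF|\, |D_\ell D_kF| \int_0^\infty e^{-t} \big(\mathbb{E}\big[(D_jF(X^t))^2 (D_kF(X^t))^2 \,\big|\, X\big]\big)^{1/2}\,dt \\
&\leq \sum_{j,k=1}^\infty |D_\ell D_jF|\, |D_\ell D_kF| \Big(\int_0^\infty e^{-t}\, \mathbb{E}\big[(D_jF(X^t))^2 (D_kF(X^t))^2 \,\big|\, X\big]\,dt\Big)^{1/2}.
\end{aligned}
\]
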

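Thus, another application of the Cauchy-Schwarz inequality leads to the bound

\[
\begin{aligned}
T_1 &\leq \mathbb{E}\Big[\sum_{j,k,\ell=1}^\infty |D_\ell D_jF|\, |D_\ell D_kF| \Big(\int_0^\infty e^{-t}\, \mathbb{E}\big[(D_jF(X^t))^2 (D_kF(X^t))^2 \,\big|\, X\big]\,dt\Big)^{1/2}\Big] \\
&\leq \sum_{j,k,\ell=1}^\infty \big(\mathbb{E}[(D_\ell D_jF)^2(D_\ell D_kF)^2]\big)^{1/2} \Big(\mathbb{E}\Big[\int_0^\infty e^{-t}\, \mathbb{E}\big[(D_jF(X^t))^2 (D_kF(X^t))^2 \,\big|\, X\big]\,dt\Big]\Big)^{1/2} \\
&= \sum_{j,k,\ell=1}^\infty \big(\mathbb{E}[(D_\ell D_jF)^2(D_\ell D_kF)^2]\big)^{1/2} \Big(\int_0^\infty e^{-t}\, \mathbb{E}[(D_jF)^2(D_kF)^2]\,dt\Big)^{1/2} \\
&= \sum_{j,k,\ell=1}^\infty \big(\mathbb{E}[(D_\ell D_jF)^2(D_\ell D_kF)^2]\big)^{1/2} \big(\mathbb{E}[(D_jF)^2(D_kF)^2]\big)^{1/2}.
\end{aligned}
\tag{4.3}
\]

Using similar arguments and Proposition 3.2 for $m = 2$, one shows that

\[ T_2 \leq \frac{1}{4} \sum_{j,k,\ell=1}^\infty \big(\mathbb{E}[(D_jF)^2(D_kF)^2]\big)^{1/2} \big(\mathbb{E}[(D_\ell D_jF)^2(D_\ell D_kF)^2]\big)^{1/2} \tag{4.4} \]

and

\[ T_3 \leq \frac{1}{4} \sum_{j,k,\ell=1}^\infty \frac{1}{p_\ell q_\ell}\, \mathbb{E}[(D_\ell D_jF)^2(D_\ell D_kF)^2]. \tag{4.5} \]

Thus, combining (4.3), (4.4) and (4.5) with (4.2), we get

\[
\begin{aligned}
\mathbb{E}\big[|1 - \langle DF, -DL^{-1}F \rangle_{\ell^2(\mathbb{N})}|\big]
\leq{} & \Big(\frac{15}{4} \sum_{j,k,\ell=1}^\infty \big(\mathbb{E}[(D_jF)^2(D_kF)^2]\big)^{1/2} \big(\mathbb{E}[(D_\ell D_jF)^2(D_\ell D_kF)^2]\big)^{1/2}\Big)^{1/2} \\
& + \Big(\frac{3}{4} \sum_{j,k,\ell=1}^\infty \frac{1}{p_\ell q_\ell}\, \mathbb{E}[(D_\ell D_jF)^2(D_\ell D_kF)^2]\Big)^{1/2}
\end{aligned}
\tag{4.6}
\]

as an estimate for the first summand of the bound in Proposition 4.1.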
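For the second summand we obtain

\[
\begin{aligned}
\mathbb{E}\big[\langle (pq)^{-1/2}(DF)^2, |DL^{-1}F| \rangle_{\ell^2(\mathbb{N})}\big]
&= \sum_{k=1}^\infty (p_kq_k)^{-1/2}\, \mathbb{E}\big[(D_kF)^2\, |D_kL^{-1}F|\big] \\
&\leq \sum_{k=1}^\infty (p_kq_k)^{-1/2} \big(\mathbb{E}[|D_kF|^3]\big)^{2/3} \big(\mathbb{E}[|D_kL^{-1}F|^3]\big)^{1/3} \\
&\leq \sum_{k=1}^\infty (p_kq_k)^{-1/2}\, \mathbb{E}[|D_kF|^3]
\end{aligned}
\tag{4.7}
\]

by means of Hölder's inequality with Hölder conjugates 3 and 3/2, and Proposition 3.3.
Applying a generalization of Hölder's inequality with Hölder conjugates $r,s,t \in (1,\infty)$ with
$\frac{1}{r} + \frac{1}{s} + \frac{1}{t} = 1$ as well as Proposition 3.3 to the third summand of the bound in Proposition 4.1
immediately yields

\[
\begin{aligned}
\mathbb{E}\big[\langle (pq)^{-1/2}(DF)^2, |F \cdot DL^{-1}F| \rangle_{\ell^2(\mathbb{N})}\big]
&= \sum_{k=1}^\infty (p_kq_k)^{-1/2}\, \mathbb{E}\big[|F|\, (D_kF)^2\, |D_kL^{-1}F|\big] \\
&\leq \big(\mathbb{E}[|F|^r]\big)^{1/r} \sum_{k=1}^\infty (p_kq_k)^{-1/2} \big(\mathbb{E}[|D_kF|^{2s}]\big)^{1/s} \big(\mathbb{E}[|D_kL^{-1}F|^{t}]\big)^{1/t} \\
&\leq \big(\mathbb{E}[|F|^r]\big)^{1/r} \sum_{k=1}^\infty (p_kq_k)^{-1/2} \big(\mathbb{E}[|D_kF|^{2s}]\big)^{1/s} \big(\mathbb{E}[|D_kF|^{t}]\big)^{1/t}.
\end{aligned}
\tag{4.8}
\]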


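We now apply the integration-by-parts formula (2.15) in order to bound the last term in
Proposition 4.1. To this end we note that $(D_k\mathbf{1}_{\{F>x\}})(D_kF)|D_kL^{-1}F| \geq 0$ for every $k \in \mathbb{N}$ and
we need to verify the summability condition in (2.14) in Proposition 2.2. The latter will be
verified subsequent to the following calculation. Using the integration-by-parts formula we
obtain that

\[
\begin{aligned}
\mathbb{E}\big[\langle (pq)^{-1/2}(DF)\, D\mathbf{1}_{\{F>x\}}, |DL^{-1}F| \rangle_{\ell^2(\mathbb{N})}\big]
&= \mathbb{E}\big[\langle D\mathbf{1}_{\{F>x\}}, (pq)^{-1/2}(DF)|DL^{-1}F| \rangle_{\ell^2(\mathbb{N})}\big] \\
&= \mathbb{E}\big[\mathbf{1}_{\{F>x\}}\, \delta\big((pq)^{-1/2}(DF)|DL^{-1}F|\big)\big] \\
&\leq \mathbb{E}\big[\big|\delta\big((pq)^{-1/2}(DF)|DL^{-1}F|\big)\big|\big] \\
&\leq \big(\mathbb{E}\big[\big(\delta\big((pq)^{-1/2}(DF)|DL^{-1}F|\big)\big)^2\big]\big)^{1/2}.
\end{aligned}
\tag{4.9}
\]

From the isometric formula (2.18) for the divergence operator it follows that

\[
\begin{aligned}
\mathbb{E}\big[\big(\delta\big((pq)^{-1/2}(DF)|DL^{-1}F|\big)\big)^2\big]
\leq{} & \mathbb{E}\big[\|(pq)^{-1/2}(DF)(DL^{-1}F)\|^2_{\ell^2(\mathbb{N})}\big] \\
& + \mathbb{E}\Big[\sum_{k,\ell=1}^\infty (p_\ell q_\ell)^{-1/2} D_k\big((D_\ell F)|D_\ell L^{-1}F|\big) \cdot (p_kq_k)^{-1/2} D_\ell\big((D_kF)|D_kL^{-1}F|\big)\Big] \\
\leq{} & \mathbb{E}\big[\|(pq)^{-1/2}(DF)(DL^{-1}F)\|^2_{\ell^2(\mathbb{N})}\big] + \mathbb{E}\Big[\sum_{k,\ell=1}^\infty (p_kq_k)^{-1} \big(D_\ell\big((D_kF)(D_kL^{-1}F)\big)\big)^2\Big] \\
=:{} & T_4 + T_5.
\end{aligned}
\tag{4.10}
\]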


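The term $T_4$ can easily be estimated by means of the Cauchy-Schwarz inequality and
Proposition 3.3, which yields that

\[
\begin{aligned}
T_4 = \sum_{k=1}^\infty (p_kq_k)^{-1}\, \mathbb{E}\big[(D_kF)^2 (D_kL^{-1}F)^2\big]
&\leq \sum_{k=1}^\infty (p_kq_k)^{-1} \big(\mathbb{E}[(D_kF)^4]\big)^{1/2} \big(\mathbb{E}[(D_kL^{-1}F)^4]\big)^{1/2} \\
&\leq \sum_{k=1}^\infty (p_kq_k)^{-1}\, \mathbb{E}[(D_kF)^4].
\end{aligned}
\tag{4.11}
\]

To handle $T_5$, we first compute $\mathbb{E}[(D_\ell((D_kF)(D_kL^{-1}F)))^2]$ by using the product formula
(2.4), the Cauchy-Schwarz inequality as well as Proposition 3.3. This leads to

\[
\begin{aligned}
\mathbb{E}\big[\big(D_\ell\big((D_kF)(D_kL^{-1}F)\big)\big)^2\big]
={} & \mathbb{E}\big[\big((D_\ell D_kF)(D_kL^{-1}F) + (D_kF)(D_\ell D_kL^{-1}F) - (p_\ell q_\ell)^{-1/2} X_\ell (D_\ell D_kF)(D_\ell D_kL^{-1}F)\big)^2\big] \\
\leq{} & 3\, \mathbb{E}\big[(D_\ell D_kF)^2 (D_kL^{-1}F)^2\big] + 3\, \mathbb{E}\big[(D_kF)^2 (D_\ell D_kL^{-1}F)^2\big] \\
& + 3\, (p_\ell q_\ell)^{-1}\, \mathbb{E}\big[(D_\ell D_kF)^2 (D_\ell D_kL^{-1}F)^2\big] \\
\leq{} & 3\, \big(\mathbb{E}[(D_\ell D_kF)^4]\big)^{1/2} \big(\mathbb{E}[(D_kL^{-1}F)^4]\big)^{1/2} + 3\, \big(\mathbb{E}[(D_kF)^4]\big)^{1/2} \big(\mathbb{E}[(D_\ell D_kL^{-1}F)^4]\big)^{1/2} \\
& + 3\, (p_\ell q_\ell)^{-1} \big(\mathbb{E}[(D_\ell D_kF)^4]\big)^{1/2} \big(\mathbb{E}[(D_\ell D_kL^{-1}F)^4]\big)^{1/2} \\
\leq{} & 6\, \big(\mathbb{E}[(D_kF)^4]\big)^{1/2} \big(\mathbb{E}[(D_\ell D_kF)^4]\big)^{1/2} + 3\, (p_\ell q_\ell)^{-1}\, \mathbb{E}[(D_\ell D_kF)^4].
\end{aligned}
\tag{4.12}
\]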


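We now justify the validity of the summability condition (2.14). Assume that

\[ \mathbb{E}\Big[\sum_{k,\ell=1}^\infty (D_ku_\ell)^2\Big] < \infty, \tag{4.13} \]

where $u_\ell := (p_\ell q_\ell)^{-1/2} D_\ell F\, |D_\ell L^{-1}F| = \sum_{n=0}^\infty J_n(g_{n+1}(\,\cdot\,,\ell))$. Then we obtain that

\[
\mathbb{E}\Big[\sum_{k,\ell=1}^\infty (D_ku_\ell)^2\Big] = \sum_{k,\ell=1}^\infty \mathbb{E}\big[(D_ku_\ell)^2\big]
= \sum_{k,\ell=1}^\infty \sum_{n=1}^\infty n\, n!\, \|g_{n+1}(\,\cdot\,,k,\ell)\|^2_{\ell^2(\mathbb{N})^{\otimes (n-1)}}
= \sum_{n=1}^\infty n\, n!\, \|g_{n+1}\|^2_{\ell^2(\mathbb{N})^{\otimes (n+1)}},
\]

which implies that

\[ \sum_{n=1}^\infty (n+1)\, n!\, \|g_{n+1}\|^2_{\ell^2(\mathbb{N})^{\otimes (n+1)}} \leq 2\, \mathbb{E}\Big[\sum_{k,\ell=1}^\infty (D_ku_\ell)^2\Big] < \infty. \]

Thus, the summability condition (2.14) is verified, once condition (4.13) is satisfied. Since
$T_5 = \mathbb{E}[\sum_{k,\ell=1}^\infty (D_\ell u_k)^2]$, condition (4.13) is verified, once our error bound is finite.
Otherwise, the error bound holds trivially. Combining (4.9), (4.10), (4.11) and (4.12) yields

\[
\begin{aligned}
\sup_{x \in \mathbb{R}} \mathbb{E}\big[\langle (pq)^{-1/2}(DF)\, D\mathbf{1}_{\{F>x\}}, |DL^{-1}F| \rangle_{\ell^2(\mathbb{N})}\big]
\leq{} & \Big(\sum_{k=1}^\infty (p_kq_k)^{-1}\, \mathbb{E}[(D_kF)^4]\Big)^{1/2} \\
& + \Big(6 \sum_{k,\ell=1}^\infty (p_kq_k)^{-1} \big(\mathbb{E}[(D_kF)^4]\big)^{1/2} \big(\mathbb{E}[(D_\ell D_kF)^4]\big)^{1/2}\Big)^{1/2} \\
& + \Big(3 \sum_{k,\ell=1}^\infty \frac{1}{p_kq_kp_\ell q_\ell}\, \mathbb{E}[(D_\ell D_kF)^4]\Big)^{1/2}.
\end{aligned}
\tag{4.14}
\]

This concludes the proof. □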


5. Application to the Erdős-Rényi random graph
and proof of Theorems 1.1, 1.2 and 1.3


In this section we apply Theorem 4.1 to counting statistics associated with the Erdős-Rényi
random graph and establish thereby Theorem 1.1, Theorem 1.2 and Theorem 1.3. First, we
formally introduce the model and fix some notation. For $n \in \mathbb{N}$ and a real number $p \in (0,1)$,
let $\mathcal{G}$ be the set of all simple and undirected graphs with vertex set $[n] := \{1,\ldots,n\}$. We
consider the probability space $(\mathcal{G}, \mathcal{P}(\mathcal{G}), \mathbb{P})$, where $\mathcal{P}(\mathcal{G})$ is the power set of $\mathcal{G}$ and $\mathbb{P}$ is the
probability measure given by

\[ \mathbb{P}(\{G\}) = p^{e(G)} (1-p)^{\binom{n}{2} - e(G)}, \]

where for $G \in \mathcal{G}$, $e(G)$ denotes the number of edges of $G$. The identity map on $\mathcal{G}$ is called
the Erdős-Rényi random graph and is usually abbreviated by $G(n,p)$. We refer to the book
[10] for a detailed account of the theory of random graphs.

We are interested in the number $T$ of triangles in $G(n,p)$, that is the number of sub-graphs
in $G(n,p)$ which are isomorphic to the complete graph on 3 vertices. To analyse the asymptotic
behaviour of this random variable, we typically allow $p$ to depend on $n$. Following the
literature and to simplify the notation we will often suppress the dependence on $n$ of several
(random) variables. In order to apply Theorem 4.1 to the normalized triangle counting statistic
$F := (T - \mathbb{E}[T])/\sqrt{\operatorname{Var}[T]}$, we first have to embed the model into the framework of Section 2
and Section 4. If one labels the $\binom{n}{2}$ edges of the complete graph on $n$ vertices in a fixed
but arbitrary way, $G(n,p)$ can be regarded as an outcome of $\binom{n}{2}$ independent Bernoulli trials
with success probability equal to $p$. Here, success in the $k$'th Bernoulli trial means that the
$k$'th edge is visible in $G(n,p)$. Hence, $G(n,p)$ can be identified with the vector $(X_1,\ldots,X_{\binom{n}{2}})$
of independent Rademacher random variables with parameter $p_k = p$, where $X_k = +1$ indicates
that the edge with number $k$ is visible in $G(n,p)$. From now on, we fix an arbitrary
enumeration of the edges in the complete graph on the vertex set $[n]$, write $I := \{1,\ldots,\binom{n}{2}\}$
for the set of labels and denote by $e_k$, $k \in I$, the $k$'th edge of the graph.
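
The identification of $G(n,p)$ with a vector of Rademacher variables can be made concrete in a few lines. The following is a minimal sketch with hypothetical function names: edges are indexed by vertex pairs rather than by the labels $k \in I$, and the difference $T_k^+ - T_k^-$ (triangle count with edge $e_k$ forced visible minus the count with $e_k$ forced invisible) is compared with the number of common neighbours of the two endpoints of $e_k$.

```python
import itertools
import random


def sample_graph(n, p, rng):
    """Rademacher variables X_e in {+1, -1}, one per pair e = (i, j) with i < j."""
    return {e: (1 if rng.random() < p else -1)
            for e in itertools.combinations(range(n), 2)}


def triangles(n, x):
    """Number of triangles all of whose three edges are visible (X_e = +1)."""
    return sum(1 for a, b, c in itertools.combinations(range(n), 3)
               if x[(a, b)] == 1 and x[(a, c)] == 1 and x[(b, c)] == 1)


def edge_difference(n, x, e):
    """T_e^+ - T_e^-: edge e forced visible minus edge e forced invisible."""
    plus = dict(x)
    plus[e] = 1
    minus = dict(x)
    minus[e] = -1
    return triangles(n, plus) - triangles(n, minus)


def common_neighbours(n, x, e):
    """Number of vertices joined by visible edges to both endpoints of e."""
    i, j = e
    def key(a, b):
        return (min(a, b), max(a, b))
    return sum(1 for v in range(n)
               if v not in e and x[key(i, v)] == 1 and x[key(j, v)] == 1)


rng = random.Random(0)
x = sample_graph(7, 0.4, rng)
print(all(edge_difference(7, x, e) == common_neighbours(7, x, e)
          for e in itertools.combinations(range(7), 2)))
```

Each of the $n-2$ remaining vertices is a common neighbour with probability $p^2$, independently of the others, which is the binomial structure used in the proof of Theorem 1.1.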

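Recall from Lemma 3.5 in [10] that

\[
\operatorname{Var}[T] \asymp
\begin{cases}
\theta^5 n^{4-5\alpha} & \text{if } 0 \leq \alpha \leq \frac{1}{2} \\
\theta^3 n^{3(1-\alpha)} & \text{if } \frac{1}{2} < \alpha < 1,
\end{cases}
\tag{5.1}
\]

where we recall that the success probability is given by $p = \theta n^{-\alpha}$ with $\alpha \in [0,1)$ and $\theta \in (0,n^{\alpha})$
such that $\theta \asymp 1$.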


Proof of Theorem 1.1. First, we notice that the assumptions of Theorem 4.1 are satisfied since
$F$ is normalized and only depends on finitely many Rademacher variables.

To evaluate the bound in Theorem 4.1 we have to control the random variables $D_kF$ and
$D_kD_jF$ for $k,j \in \{1,\ldots,\binom{n}{2}\}$. We have

\[ D_kF = \sqrt{pq}\, (F_k^+ - F_k^-) = \frac{\sqrt{pq}}{\sqrt{\operatorname{Var}[T]}}\, (T_k^+ - T_k^-) \]

and hence

\[ \frac{\sqrt{\operatorname{Var}[T]}}{\sqrt{pq}}\, D_kF = T_k^+ - T_k^-. \]

Now, we notice that $T_k^+$ equals the number of triangles in the random graph $G(n,p)$ with $e_k$
visible, while $T_k^-$ equals the number of triangles in the random graph $G(n,p)$ when $e_k$ is not
visible. Thus, $T_k^+ - T_k^-$ is the number of triangles that have edge $e_k$ in common, which shows
that the random variable $T_k^+ - T_k^-$ has a binomial distribution $\mathrm{Bin}(n-2,p^2)$ with parameters
$n-2$ and $p^2$. This is a consequence of the fact that there are $n-2$ possible triangles being
attached to the $k$'th edge and each of these triangles is a sub-graph of $G(n,p)$ with probability
$p^2$, independently of all other triangles. Hence,

\[ \frac{\sqrt{\operatorname{Var}[T]}}{\sqrt{pq}}\, D_kF \sim \mathrm{Bin}(n-2,p^2). \tag{5.2} \]

Next, we consider the second-order discrete gradient and obtain that

\[ D_kD_jF = \frac{\sqrt{pq}}{\sqrt{\operatorname{Var}[T]}}\, D_k(T_j^+ - T_j^-) = \frac{pq}{\sqrt{\operatorname{Var}[T]}} \Big((T_j^+)_k^+ - (T_j^+)_k^- - \big((T_j^-)_k^+ - (T_j^-)_k^-\big)\Big), \]

whence

\[ \frac{\sqrt{\operatorname{Var}[T]}}{pq}\, D_kD_jF = (T_j^+)_k^+ - (T_j^+)_k^- - \big((T_j^-)_k^+ - (T_j^-)_k^-\big). \]

The random variable $(T_j^+)_k^+ - (T_j^+)_k^-$ counts the number of triangles in $G(n,p)$ adjacent to
the $k$'th edge $e_k$, conditioned on the event that the $j$'th edge $e_j$ is visible in $G(n,p)$. Similarly,
$(T_j^-)_k^+ - (T_j^-)_k^-$ counts the number of triangles adjacent to $e_k$ when $e_j$ does not belong to
$G(n,p)$. Therefore, $(T_j^+)_k^+ - (T_j^+)_k^- - ((T_j^-)_k^+ - (T_j^-)_k^-)$ is the number of triangles with common
edges $e_k$ and $e_j$. The number of vertices shared by both edges $e_k$ and $e_j$ is $|e_k \cap e_j|$. Then, if
$|e_k \cap e_j| \in \{0,2\}$, $(T_j^+)_k^+ - (T_j^+)_k^- - ((T_j^-)_k^+ - (T_j^-)_k^-) = 0$ and if $|e_k \cap e_j| = 1$, we have $e_k = \{r,s\}$
and $e_j = \{r,t\}$ for some $r,s,t \in [n]$, $s \neq t$. In this case, $(T_j^+)_k^+ - (T_j^+)_k^- - ((T_j^-)_k^+ - (T_j^-)_k^-)$
is either 1 or 0, depending on whether the edge $\{s,t\}$ belongs to $G(n,p)$ or not. Thus,

\[
\frac{\sqrt{\operatorname{Var}[T]}}{pq}\, D_kD_jF
\begin{cases}
\sim \mathrm{Ber}(p) & \text{if } |e_k \cap e_j| = 1 \\
= 0 & \text{if } |e_k \cap e_j| \in \{0,2\},
\end{cases}
\tag{5.3}
\]
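
The case distinction in (5.3) can be verified by brute force on a small graph. The sketch below is illustrative only: the function names are hypothetical, edges are represented as two-element `frozenset`s, and the mixed second difference is evaluated directly from four triangle counts.

```python
import itertools


def triangles(n, visible):
    """Number of triangles whose three edges all belong to the set `visible`."""
    return sum(1 for tri in itertools.combinations(range(n), 3)
               if all(frozenset(pair) in visible
                      for pair in itertools.combinations(tri, 2)))


def second_difference(n, visible, ek, ej):
    """(T_j^+)_k^+ - (T_j^+)_k^- - ((T_j^-)_k^+ - (T_j^-)_k^-)."""
    if ek == ej:
        # T_j^+ and T_j^- no longer depend on X_j, so the difference vanishes.
        return 0

    def count(k_visible, j_visible):
        g = set(visible) - {ek, ej}
        if k_visible:
            g.add(ek)
        if j_visible:
            g.add(ej)
        return triangles(n, g)

    return (count(True, True) - count(False, True)) \
        - (count(True, False) - count(False, False))


e01, e02, e12, e23 = (frozenset(s) for s in ((0, 1), (0, 2), (1, 2), (2, 3)))
print(second_difference(5, {e12}, e01, e02))   # third edge {1, 2} visible
print(second_difference(5, set(), e01, e02))   # third edge {1, 2} not visible
print(second_difference(5, {e12}, e01, e23))   # disjoint edges
```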


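where $\mathrm{Ber}(p) = \mathrm{Bin}(1,p)$ indicates a Bernoulli distribution with parameter $p$. Note that the
random variables $D_\ell D_kF$ and $D_\ell D_jF$ are independent whenever $k \neq j$. Indeed, fix $\ell$ and let
$k \neq j$, and suppose that $|e_k \cap e_\ell| \in \{0,2\}$ or $|e_j \cap e_\ell| \in \{0,2\}$. Then $D_\ell D_kF$ and $D_\ell D_jF$
are independent, since at least one of them is equal to zero. Now, consider the case that
$|e_k \cap e_\ell| = 1$ and $|e_j \cap e_\ell| = 1$. In this situation, the three edges $e_k$, $e_j$, $e_\ell$ can have the
following form. Either

\[ e_k = \{s,t\}, \quad e_j = \{u,v\}, \quad e_\ell = \{t,u\}, \qquad s \neq u,\ v \neq t, \tag{5.4} \]

or

\[ e_k = \{s,t\}, \quad e_j = \{u,t\}, \quad e_\ell = \{v,t\}, \qquad s \neq v,\ u \neq v. \tag{5.5} \]

In the situation of (5.4), we have $\{s,u\} = e_a$ and $\{t,v\} = e_b$ for some $a,b \in I$, $a \neq b$ and thus

\[ \frac{\sqrt{\operatorname{Var}[T]}}{pq}\, D_\ell D_kF = \mathbf{1}_{\{X_a = 1\}} \qquad \text{and} \qquad \frac{\sqrt{\operatorname{Var}[T]}}{pq}\, D_\ell D_jF = \mathbf{1}_{\{X_b = 1\}}, \]

which implies the independence of $D_\ell D_kF$ and $D_\ell D_jF$ in this case. In the situation of (5.5)
we obtain $\{s,v\} = e_a$ and $\{u,v\} = e_b$ for some $a,b \in I$, $a \neq b$, and hence

\[ \frac{\sqrt{\operatorname{Var}[T]}}{pq}\, D_\ell D_kF = \mathbf{1}_{\{X_a = 1\}} \qquad \text{and} \qquad \frac{\sqrt{\operatorname{Var}[T]}}{pq}\, D_\ell D_jF = \mathbf{1}_{\{X_b = 1\}}, \]

which implies the independence of $D_\ell D_kF$ and $D_\ell D_jF$ in the second case.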

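In view of (5.2) and the bound in Theorem 4.1 we need an expression for the fractional
moments of a binomial random variable $Z \sim \mathrm{Bin}(n,p)$ with $n \in \mathbb{N}$ and $p \in (0,1)$. It is well
known that

\[
\mathbb{E}[Z^{\beta}] \asymp
\begin{cases}
(np)^{\beta} & \text{if } np \to \infty \\
np & \text{if } np \to 0,
\end{cases}
\qquad \beta \in [1,\infty).
\]

As a consequence, we deduce that for $n \in \{3,4,\ldots\}$, $\alpha \in [0,1)$ and $\theta \in (0,n^{\alpha})$ with $\theta \asymp 1$,
the binomial random variable $Z \sim \mathrm{Bin}(n-2, \theta^2 n^{-2\alpha})$ satisfies

\[
\mathbb{E}[Z^{\beta}] \asymp
\begin{cases}
\theta^{2\beta} n^{\beta(1-2\alpha)} & \text{if } 0 \leq \alpha \leq \frac{1}{2} \\
\theta^2 n^{1-2\alpha} & \text{if } \frac{1}{2} < \alpha < 1,
\end{cases}
\qquad \beta \in [1,\infty).
\tag{5.6}
\]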
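With (5.2), (5.3) and (5.6) at hand we are now prepared for the evaluation of the bound in
Theorem 4.1. The following terms have to be considered:

\[
\begin{aligned}
A_1 &:= \Big(\frac{15}{4} \sum_{j,k,\ell \in I} \big(\mathbb{E}[(D_jF)^2(D_kF)^2]\big)^{1/2} \big(\mathbb{E}[(D_\ell D_jF)^2(D_\ell D_kF)^2]\big)^{1/2}\Big)^{1/2}, \\
A_2 &:= \Big(\frac{3}{4} \sum_{j,k,\ell \in I} \frac{1}{pq}\, \mathbb{E}[(D_\ell D_jF)^2(D_\ell D_kF)^2]\Big)^{1/2}, \\
A_3 &:= \frac{\sqrt{2\pi}}{8} \sum_{k \in I} \frac{1}{\sqrt{pq}}\, \mathbb{E}[|D_kF|^3], \\
A_4 &:= \frac{1}{2} \big(\mathbb{E}[|F|^r]\big)^{1/r} \sum_{k \in I} \frac{1}{\sqrt{pq}} \big(\mathbb{E}[|D_kF|^{2s}]\big)^{1/s} \big(\mathbb{E}[|D_kF|^{t}]\big)^{1/t}, \\
A_5 &:= \Big(\sum_{k \in I} \frac{1}{pq}\, \mathbb{E}[(D_kF)^4]\Big)^{1/2}, \\
A_6 &:= \Big(6 \sum_{k,\ell \in I} \frac{1}{pq} \big(\mathbb{E}[(D_kF)^4]\big)^{1/2} \big(\mathbb{E}[(D_\ell D_kF)^4]\big)^{1/2}\Big)^{1/2}, \\
A_7 &:= \Big(3 \sum_{k,\ell \in I} \frac{1}{(pq)^2}\, \mathbb{E}[(D_\ell D_kF)^4]\Big)^{1/2},
\end{aligned}
\]

where in $A_4$, $r,s,t \in (1,\infty)$ are such that $\frac{1}{r} + \frac{1}{s} + \frac{1}{t} = 1$. Let us begin with the term $A_1$. Using
the independence of $D_\ell D_kF$ and $D_\ell D_jF$ for $k \neq j$ as well as the Cauchy-Schwarz inequality
we obtain

\[
\begin{aligned}
& \sum_{j,k,\ell \in I} \big(\mathbb{E}[(D_jF)^2(D_kF)^2]\big)^{1/2} \big(\mathbb{E}[(D_\ell D_jF)^2(D_\ell D_kF)^2]\big)^{1/2} \\
&\quad = \sum_{j,\ell \in I} \big(\mathbb{E}[(D_jF)^4]\big)^{1/2} \big(\mathbb{E}[(D_\ell D_jF)^4]\big)^{1/2} \\
&\qquad + \sum_{\substack{j,k,\ell \in I \\ k \neq j}} \big(\mathbb{E}[(D_jF)^2(D_kF)^2]\big)^{1/2} \big(\mathbb{E}[(D_\ell D_jF)^2]\big)^{1/2} \big(\mathbb{E}[(D_\ell D_kF)^2]\big)^{1/2} \\
&\quad \leq \sum_{j,\ell \in I} \big(\mathbb{E}[(D_jF)^4]\big)^{1/2} \big(\mathbb{E}[(D_\ell D_jF)^4]\big)^{1/2} \\
&\qquad + \sum_{\substack{j,k,\ell \in I \\ k \neq j}} \big(\mathbb{E}[(D_jF)^4]\big)^{1/4} \big(\mathbb{E}[(D_kF)^4]\big)^{1/4} \big(\mathbb{E}[(D_\ell D_jF)^2]\big)^{1/2} \big(\mathbb{E}[(D_\ell D_kF)^2]\big)^{1/2}.
\end{aligned}
\tag{5.7}
\]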

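We consider the two summands of the last estimate separately. Denote by $\mu_4$ the fourth
moment of a $\mathrm{Bin}(n-2,p^2)$-distributed random variable. Using (5.2) and (5.3), we see that

\[
\begin{aligned}
\sum_{j,\ell \in I} \big(\mathbb{E}[(D_jF)^4]\big)^{1/2} \big(\mathbb{E}[(D_\ell D_jF)^4]\big)^{1/2}
&= \frac{(pq)^3}{(\operatorname{Var}[T])^2} \sum_{j,\ell \in I} \Big(\mathbb{E}\Big[\Big(\frac{\sqrt{\operatorname{Var}[T]}}{\sqrt{pq}}\, D_jF\Big)^4\Big]\Big)^{1/2} \Big(\mathbb{E}\Big[\Big(\frac{\sqrt{\operatorname{Var}[T]}}{pq}\, D_\ell D_jF\Big)^4\Big]\Big)^{1/2} \\
&= \frac{(pq)^3}{(\operatorname{Var}[T])^2} \sum_{j,\ell \in I} \mu_4^{1/2}\, p^{1/2}\, \mathbf{1}_{\{|e_j \cap e_\ell| = 1\}} \\
&= \frac{(pq)^3}{(\operatorname{Var}[T])^2}\, \mu_4^{1/2}\, p^{1/2} \binom{n}{2} 2(n-2) \\
&\asymp \frac{(pq)^3}{(\operatorname{Var}[T])^2}\, \mu_4^{1/2}\, p^{1/2}\, n^3.
\end{aligned}
\tag{5.8}
\]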
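For the second summand on the right hand side of (5.7) we obtain

\[
\begin{aligned}
& \sum_{\substack{j,k,\ell \in I \\ j \neq k}} \big(\mathbb{E}[(D_jF)^4]\big)^{1/4} \big(\mathbb{E}[(D_kF)^4]\big)^{1/4} \big(\mathbb{E}[(D_\ell D_jF)^2]\big)^{1/2} \big(\mathbb{E}[(D_\ell D_kF)^2]\big)^{1/2} \\
&\quad = \frac{(pq)^3}{(\operatorname{Var}[T])^2} \sum_{\substack{j,k,\ell \in I \\ j \neq k}} \mu_4^{1/2}\, p\, \mathbf{1}_{\{|e_j \cap e_\ell| = 1\}}\, \mathbf{1}_{\{|e_k \cap e_\ell| = 1\}} \\
&\quad \asymp \frac{(pq)^3}{(\operatorname{Var}[T])^2}\, \mu_4^{1/2}\, p\, n^4 \\
&\quad = \frac{(pq)^3}{(\operatorname{Var}[T])^2}\, \mu_4^{1/2}\, p^{1/2} n^3 \cdot p^{1/2} n.
\end{aligned}
\tag{5.9}
\]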


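Comparing (5.8) with (5.9) one can see that the second summand in (5.7) determines the
asymptotic behaviour of $A_1$, since $p^{1/2}n = \theta^{1/2} n^{1-\alpha/2} \to \infty$, as $n \to \infty$. By use of (5.1) and
(5.6) we obtain

\[
\frac{(pq)^3}{(\operatorname{Var}[T])^2}\, \mu_4^{1/2}\, p\, n^4 \asymp
\begin{cases}
\theta^{-2} n^{-2+2\alpha} & \text{if } 0 \leq \alpha \leq \frac{1}{2} \\
\theta^{-1} n^{-\frac{3}{2}+\alpha} & \text{if } \frac{1}{2} < \alpha < 1.
\end{cases}
\tag{5.10}
\]

Combining (5.7), (5.8), (5.9) and (5.10) yields that

\[
A_1 = \Big(\frac{15}{4} \sum_{j,k,\ell \in I} \big(\mathbb{E}[(D_jF)^2(D_kF)^2]\big)^{1/2} \big(\mathbb{E}[(D_\ell D_jF)^2(D_\ell D_kF)^2]\big)^{1/2}\Big)^{1/2}
=
\begin{cases}
O(n^{-1+\alpha}) & \text{if } 0 \leq \alpha \leq \frac{1}{2} \\
O(n^{-\frac{3}{4}+\frac{\alpha}{2}}) & \text{if } \frac{1}{2} < \alpha < 1.
\end{cases}
\tag{5.11}
\]
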

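With the same arguments as above and by using the additional information on the asymptotics
of the third moment of a $\mathrm{Bin}(n-2,p^2)$ random variable from (5.6), we obtain the following
bounds for $A_2$, $A_3$, $A_5$, $A_6$ and $A_7$:

\[
A_2 =
\begin{cases}
O(n^{-2+\frac{5\alpha}{2}}) & \text{if } 0 \leq \alpha \leq \frac{1}{2} \\
O(n^{-1+\frac{\alpha}{2}}) & \text{if } \frac{1}{2} < \alpha < 1,
\end{cases}
\qquad
A_3 =
\begin{cases}
O(n^{-1+\frac{\alpha}{2}}) & \text{if } 0 \leq \alpha \leq \frac{1}{2} \\
O(n^{-\frac{3}{2}+\frac{3\alpha}{2}}) & \text{if } \frac{1}{2} < \alpha < 1,
\end{cases}
\tag{5.12}
\]

\[
A_5 =
\begin{cases}
O(n^{-1+\frac{\alpha}{2}}) & \text{if } 0 \leq \alpha \leq \frac{1}{2} \\
O(n^{-\frac{3}{2}+\frac{3\alpha}{2}}) & \text{if } \frac{1}{2} < \alpha < 1,
\end{cases}
\qquad
A_6 =
\begin{cases}
O(n^{-\frac{3}{2}+\frac{7\alpha}{4}}) & \text{if } 0 \leq \alpha \leq \frac{1}{2} \\
O(n^{-\frac{5}{4}+\frac{5\alpha}{4}}) & \text{if } \frac{1}{2} < \alpha < 1,
\end{cases}
\tag{5.13}
\]

\[
A_7 =
\begin{cases}
O(n^{-\frac{5}{2}+\frac{7\alpha}{2}}) & \text{if } 0 \leq \alpha \leq \frac{1}{2} \\
O(n^{-\frac{3}{2}+\frac{3\alpha}{2}}) & \text{if } \frac{1}{2} < \alpha < 1.
\end{cases}
\tag{5.14}
\]

To describe the asymptotic behaviour of

\[ A_4 = \frac{1}{2} \big(\mathbb{E}[|F|^r]\big)^{1/r} \sum_{k \in I} \frac{1}{\sqrt{pq}} \big(\mathbb{E}[|D_kF|^{2s}]\big)^{1/s} \big(\mathbb{E}[|D_kF|^{t}]\big)^{1/t} \]

with $r,s,t \in (1,\infty)$ and $\frac{1}{r} + \frac{1}{s} + \frac{1}{t} = 1$, we use the following moment asymptotics, which is
taken from the proof of [34, Theorem 2]. As $n \to \infty$, it holds that

\[
\mathbb{E}[F^k] \to
\begin{cases}
0 & \text{if } k \in \mathbb{N} \text{ is odd} \\
\dfrac{k!}{(k/2)!\, 2^{k/2}} & \text{if } k \in \mathbb{N} \text{ is even.}
\end{cases}
\tag{5.15}
\]
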

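We will choose $r$ in such a way that $A_4$ converges to zero at least as fast as all the other
terms $A_1,\ldots,A_7$ that have already been computed. So, fix an even integer $r \geq 2$ and choose
$s,t \in (1,\infty)$ such that $\frac{1}{r} + \frac{1}{s} + \frac{1}{t} = 1$. For $\beta \in [1,\infty)$ let $\mu_\beta$ be the moment of order $\beta$ of a
$\mathrm{Bin}(n-2,p^2)$ random variable. Using (5.2), we obtain

\[
\begin{aligned}
\sum_{k \in I} \frac{1}{\sqrt{pq}} \big(\mathbb{E}[|D_kF|^{2s}]\big)^{1/s} \big(\mathbb{E}[|D_kF|^{t}]\big)^{1/t}
&= \sum_{k \in I} \frac{1}{\sqrt{pq}}\, \frac{(pq)^{3/2}}{(\operatorname{Var}[T])^{3/2}} \Big(\mathbb{E}\Big[\Big(\frac{\sqrt{\operatorname{Var}[T]}}{\sqrt{pq}}\, D_kF\Big)^{2s}\Big]\Big)^{1/s} \Big(\mathbb{E}\Big[\Big(\frac{\sqrt{\operatorname{Var}[T]}}{\sqrt{pq}}\, D_kF\Big)^{t}\Big]\Big)^{1/t} \\
&= \binom{n}{2} \frac{pq}{(\operatorname{Var}[T])^{3/2}}\, \mu_{2s}^{1/s}\, \mu_t^{1/t}.
\end{aligned}
\tag{5.16}
\]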


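Note that the absolute values are omitted since $D_kF$ is non-negative. Resorting to (5.1) and
(5.6) and using that $\frac{1}{s} + \frac{1}{t} = 1 - \frac{1}{r}$ we get

\[
\binom{n}{2} \frac{pq}{(\operatorname{Var}[T])^{3/2}}\, \mu_{2s}^{1/s}\, \mu_t^{1/t} \asymp
\begin{cases}
\theta^{-\frac{1}{2}} n^{-1+\frac{\alpha}{2}} & \text{if } 0 \leq \alpha \leq \frac{1}{2} \\
\theta^{-\frac{3}{2}-\frac{2}{r}} n^{-\frac{3}{2}-\frac{1}{r}+\frac{3\alpha}{2}+\frac{2\alpha}{r}} & \text{if } \frac{1}{2} < \alpha < 1.
\end{cases}
\tag{5.17}
\]

Combining (5.15), (5.16) and (5.17), we obtain that for all even integers $r \geq 2$,

\[
A_4 =
\begin{cases}
O(n^{-1+\frac{\alpha}{2}}) & \text{if } 0 \leq \alpha \leq \frac{1}{2} \\
O(n^{-\frac{3}{2}-\frac{1}{r}+\frac{3\alpha}{2}+\frac{2\alpha}{r}}) & \text{if } \frac{1}{2} < \alpha < 1.
\end{cases}
\tag{5.18}
\]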

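If $0 \leq \alpha \leq \frac{1}{2}$, the bound in (5.18) does not depend on $r$ and is of lower order compared to
the bounds in (5.11)–(5.14). In the case $\frac{1}{2} < \alpha \leq \frac{2}{3}$ the term $A_1$ in (5.11) determines the
leading-order asymptotics among the bounds in (5.11)–(5.14) if $r \geq 2$ is chosen in such a way
that

\[ -\frac{3}{2} + \frac{3}{2}\alpha + \frac{2}{r}\alpha - \frac{1}{r} \leq -\frac{5}{4} + \frac{5}{4}\alpha, \]

or, equivalently,

\[ r \geq \frac{4(2\alpha - 1)}{1 - \alpha}. \]

Namely, we put $r$ as the smallest even integer larger or equal to $\max\{2, \frac{4(2\alpha-1)}{1-\alpha}\}$ and conclude
from (5.18) that

\[
A_4 =
\begin{cases}
O(n^{-1+\frac{\alpha}{2}}) & \text{if } 0 \leq \alpha \leq \frac{1}{2} \\
O(n^{-\frac{5}{4}+\frac{5\alpha}{4}}) & \text{if } \frac{1}{2} < \alpha < 1.
\end{cases}
\tag{5.19}
\]

Moreover, if $\frac{2}{3} < \alpha < 1$, the Kolmogorov distance is dominated by the term $A_6$. This
concludes the proof. □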
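
The choice of $r$ in the last step of the proof can be tabulated with exact rational arithmetic. The following sketch uses hypothetical function names and takes as given only two facts from the proof: for $\frac{1}{2} < \alpha < 1$ the exponent of $n$ in the bound (5.18) for $A_4$ is $-\frac{3}{2} - \frac{1}{r} + \frac{3\alpha}{2} + \frac{2\alpha}{r}$, and $r$ is the smallest even integer with $r \geq \max\{2, 4(2\alpha-1)/(1-\alpha)\}$.

```python
import math
from fractions import Fraction


def smallest_even_r(alpha):
    """Smallest even integer r >= max(2, 4(2*alpha - 1)/(1 - alpha)), 1/2 < alpha < 1."""
    bound = max(Fraction(2), 4 * (2 * alpha - 1) / (1 - alpha))
    r = math.ceil(bound)
    return r if r % 2 == 0 else r + 1


def a4_exponent(alpha, r):
    """Exponent of n in the bound for A_4 when 1/2 < alpha < 1."""
    return Fraction(-3, 2) - Fraction(1, r) + Fraction(3, 2) * alpha + 2 * alpha / r


def target_exponent(alpha):
    """Right hand side -5/4 + 5*alpha/4 of the condition on r."""
    return Fraction(-5, 4) + Fraction(5, 4) * alpha


for alpha in (Fraction(3, 5), Fraction(7, 10), Fraction(3, 4), Fraction(9, 10)):
    r = smallest_even_r(alpha)
    print(alpha, r, a4_exponent(alpha, r) <= target_exponent(alpha))
```

The required $r$ grows without bound as $\alpha \to 1$, which is why the freedom to choose $r$, $s$ and $t$ in Theorem 4.1 matters in this application.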


After having established Theorem 1.1 we turn to the proof of Theorem 1.2. Recall that in
this situation $p \in (0,1)$ is fixed and that $\Gamma$ is a graph with at least one edge. Furthermore, $S$
is the number of copies of $\Gamma$ in $G(n,p)$ and $F := (S - \mathbb{E}[S])/\sqrt{\operatorname{Var}[S]}$ denotes the normalized
sub-graph counting statistic. Let us recall from [10, Lemma 3.5] that

\[ \operatorname{Var}[S] \asymp c(p,\Gamma)\, n^{2v-2}, \tag{5.20} \]

where $c(p,\Gamma) \in (0,\infty)$ is a constant only depending on $p$ and $\Gamma$, and where $v = v(\Gamma)$ stands
for the number of vertices of $\Gamma$. Finally, we recall that $I$ stands for the set $\{1,\ldots,\binom{n}{2}\}$ and
put $q := 1 - p$.

Proof of Theorem 1.2. First, we assume that $n \geq v \geq 4$. Note that for $k \in I$, $S_k^+$ and $S_k^-$ are
the number of copies of $\Gamma$ if edge $e_k$ is present in $G(n,p)$ or not, respectively. Thus, $S_k^+ - S_k^-$
is the number of copies of $\Gamma$ in $G(n,p)$ sharing edge $e_k$. Since there are $\binom{n-2}{v-2}$ choices for the
remaining vertices needed to build such a copy, we have that

\[ D_kF = \frac{\sqrt{pq}}{\sqrt{\operatorname{Var}[S]}}\, (S_k^+ - S_k^-) = O(n^{-1}), \qquad k \in I, \]

where we also used (5.20). Next, we consider the second-order discrete gradient

\[ D_\ell D_kF = \frac{pq}{\sqrt{\operatorname{Var}[S]}} \big((S_k^+)_\ell^+ - (S_k^+)_\ell^- - (S_k^-)_\ell^+ + (S_k^-)_\ell^-\big), \qquad k,\ell \in I. \]

If $|e_k \cap e_\ell| = 0$, $v - 4$ further vertices are needed to build a copy of $\Gamma$ containing the edges $e_k$
and $e_\ell$. Since there are $\binom{n-4}{v-4}$ choices for these vertices and because of (5.20), one has that

\[ D_\ell D_kF = O(n^{-3}). \tag{5.21} \]

Similarly, if $|e_k \cap e_\ell| = 1$ we find that

\[ D_\ell D_kF = O(n^{-2}) \tag{5.22} \]

and if $|e_k \cap e_\ell| = 2$ we have $k = \ell$ and hence

\[ D_\ell D_kF = 0. \tag{5.23} \]


We can now evaluate the terms arising in Theorem 4.1, which we denote by $A_1,\ldots,A_7$. For
$A_1$ we have that

\[ A_1^2 := \frac{15}{4} \sum_{j,k,\ell \in I} \big(\mathbb{E}[(D_jF)^2(D_kF)^2]\big)^{1/2} \big(\mathbb{E}[(D_\ell D_jF)^2(D_\ell D_kF)^2]\big)^{1/2}. \]

Using the Cauchy-Schwarz inequality, we see that

\[ A_1^2 \leq \frac{15}{4} \sum_{\ell \in I} \Big(\sum_{k \in I} \big(\mathbb{E}[(D_kF)^4]\big)^{1/4} \big(\mathbb{E}[(D_\ell D_kF)^4]\big)^{1/4}\Big)^2 \]

and a distinction of the cases $|e_k \cap e_\ell| = 0$, $|e_k \cap e_\ell| = 1$ and $|e_k \cap e_\ell| = 2$ yields

\[ A_1 = O(n^{-1}) \]

by (5.21), (5.22) and (5.23). Similar considerations with $r = 2$ and $s = t = 4$ lead to

\[ A_2 = O(n^{-2}), \quad A_3 = O(n^{-1}), \quad A_4 = O(n^{-1}), \]

\[ A_5 = O(n^{-1}), \quad A_6 = O(n^{-3/2}), \quad A_7 = O(n^{-5/2}) \]

and hence to $d_K(F,N) = O(n^{-1})$.

The case that $\Gamma$ has exactly two vertices is covered by the classical Berry-Esseen theorem for
a binomial distribution with parameters $\binom{n}{2}$ and $p$. If $\Gamma$ has exactly three vertices, then $\Gamma$
is either the complete graph on 3 vertices (as already covered by Theorem 1.1) or a graph
with 1 or 2 edges on 3 vertices, respectively. In these cases, instead of (5.21) one has that
$D_\ell D_kF = 0$ if $|e_k \cap e_\ell| = 0$ and one obtains that $d_K(F,N) = O(n^{-1})$. This completes the
proof. □
Finally in this section, we turn to the proof of Theorem 1.3 for which we use the same set-up
as in the proof of Theorem 1.1. In particular, we denote by $I$ the set $\{1,\ldots,\binom{n}{2}\}$ and recall
that $p = \theta n^{-\alpha}$ with suitable $\alpha \in \mathbb{R}$ and $\theta \in (0,n^{\alpha})$ such that $\theta \asymp 1$. We also put $q := 1 - p$.

For $d \in \{0,1,2,\ldots\}$ we denote by $V_{n,d}$ the number of vertices of $G(n,p)$ with degree $d$ and
put $G_{n,d} := (V_{n,d} - \mathbb{E}[V_{n,d}])/\sqrt{\operatorname{Var}[V_{n,d}]}$. Let us recall from Chapter 6.3 in [10] that

\[ \operatorname{Var}[V_{n,0}] \asymp 2n^2p = 2\theta n^{2-\alpha}, \qquad \alpha \in [1,2), \tag{5.24} \]

and for $d \in \mathbb{N}$,

\[ \operatorname{Var}[V_{n,d}] \asymp c(d,\theta)\, n^{d+1} p^d = c(d,\theta)\, n^{d+1-\alpha d}, \qquad \alpha \in [1, 1+1/d), \tag{5.25} \]

with a constant $c(d,\theta) \in (0,\infty)$ only depending on $d$ and on $\theta$. From Theorem 8 in [2] it is
known that a central limit theorem for $G_{n,0}$ holds if and only if $n^2p \to \infty$ and $np - \log n \to -\infty$,
as $n \to \infty$. In our case that $p = \theta n^{-\alpha}$ this is equivalent to $\alpha \in [1,2)$. Moreover, [10, Theorem
6.36] says that for $d \in \mathbb{N}$, $G_{n,d}$ satisfies a central limit theorem if and only if $n^{d+1}p^d \to \infty$
and $np - \log n - d \log\log n \to -\infty$, as $n \to \infty$. Again, in our case this is equivalent to
$\alpha \in [1, 1+1/d)$, whence the conditions on $\alpha$ in (5.24) and (5.25).


Proof of Theorem 1.3. At first, we notice that adding or removing an edge from $G(n,p)$ can
change the number of vertices with degree equal to $d \in \{0,1,2,\ldots\}$ by at most 2. This implies
that

\[ |D_kG_{n,d}| \leq \frac{2\sqrt{pq}}{\sqrt{\operatorname{Var}[V_{n,d}]}}, \qquad k \in I. \tag{5.26} \]

Next, we observe that $(pq)^{-1} |D_kD_\ell V_{n,d}| \in \{0,1,2\}$ for all $k,\ell \in I$. We also have that $D_kD_\ell V_{n,d}$
and hence $D_kD_\ell G_{n,d}$ is zero whenever the two corresponding edges $e_k$ and $e_\ell$ are identical or
do not share a common vertex. Resorting to the definition of the random variable $G_{n,d}$, we
thus conclude that

\[ |D_kD_\ell G_{n,d}| \leq \frac{2pq}{\sqrt{\operatorname{Var}[V_{n,d}]}}\, \mathbf{1}_{\{|e_k \cap e_\ell| = 1\}}, \qquad k,\ell \in I. \tag{5.27} \]
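
The combinatorial fact behind (5.26), namely that switching a single edge changes the number of degree-$d$ vertices by at most 2, can be checked exhaustively on a small example. This is a minimal sketch with hypothetical function names; graphs are given as sets of vertex pairs.

```python
import itertools
import random


def degree_count(n, edges, d):
    """Number of vertices of degree d in the graph on {0, ..., n-1} with the given edges."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return sum(1 for v in range(n) if deg[v] == d)


def max_single_edge_effect(n, edges, d):
    """Largest change of the degree-d count when one edge is switched on versus off."""
    worst = 0
    for e in itertools.combinations(range(n), 2):
        with_e = set(edges) | {e}
        without_e = set(edges) - {e}
        change = degree_count(n, with_e, d) - degree_count(n, without_e, d)
        worst = max(worst, abs(change))
    return worst


rng = random.Random(3)
graph = {e for e in itertools.combinations(range(8), 2) if rng.random() < 0.3}
print([max_single_edge_effect(8, graph, d) for d in range(4)])
```

Only the degrees of the two endpoints of the switched edge change, which explains the constant 2.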

We can now evaluate the bound in Theorem 4.1. We start with the case $d = 0$. Since the
computations are almost identical for each of the terms there, we restrict to the first term
$A_1$, which is given by

\[ A_1 := \Big(\frac{15}{4} \sum_{j,k,\ell \in I} \big(\mathbb{E}[(D_jG_{n,d})^2(D_kG_{n,d})^2]\big)^{1/2} \big(\mathbb{E}[(D_\ell D_jG_{n,d})^2(D_\ell D_kG_{n,d})^2]\big)^{1/2}\Big)^{1/2}. \]

Using (5.26) and (5.27), we see that

\[ A_1^2 \leq \frac{60\, (pq)^3}{(\operatorname{Var}[V_{n,d}])^2} \sum_{j,k,\ell \in I} \mathbf{1}_{\{|e_j \cap e_\ell| = 1\}}\, \mathbf{1}_{\{|e_k \cap e_\ell| = 1\}} = \frac{60\, (pq)^3}{(\operatorname{Var}[V_{n,d}])^2} \binom{n}{2} \big(2(n-2)\big)^2. \tag{5.28} \]

Now, we use that $p = \theta n^{-\alpha}$ as well as the variance asymptotics at (5.24). This allows us to
conclude that $A_1 = O(n^{-\alpha/2})$. Denoting the other terms arising in Theorem 4.1 by $A_2,\ldots,A_7$,
we conclude by similar computations and by taking $r = 2$, $s = t = 4$ that

\[ A_2 = O(n^{-\alpha/2}), \quad A_3 = O(n^{-1+\alpha/2}), \quad A_4 = O(n^{-1+\alpha/2}), \]

\[ A_5 = O(n^{-1+\alpha/2}), \quad A_6 = O(n^{-1/2}), \quad A_7 = O(n^{-1/2}). \]
Figure 3. Embedding in the plane of a finite sub-tree of a 2-regular tree $\mathcal{T}$.


Thus, $d_K(G_{n,0},N) = O(n^{-1+\alpha/2})$. Turning to the case $d \in \mathbb{N}$ we start again with the term
$A_1$ and obtain by using (5.28) and (5.25) that $A_1 = O(n^{1-d-3\alpha/2+\alpha d})$. Moreover, one sees for
the other terms $A_2,\ldots,A_7$ in Theorem 4.1 that

\[ A_2 = O(n^{1-d-3\alpha/2+\alpha d}), \quad A_3 = O(n^{1/2-3d/2-\alpha+3\alpha d/2}), \quad A_4 = O(n^{1/2-3d/2-\alpha+3\alpha d/2}), \]

\[ A_5 = O(n^{-d-\alpha/2+\alpha d}), \quad A_6 = O(n^{1/2-d-\alpha+\alpha d}), \quad A_7 = O(n^{1/2-d-\alpha+\alpha d}). \]

Thus, for $d \in \mathbb{N}$, $d_K(G_{n,d},N) = O(n^{1/2-3d/2-\alpha+3\alpha d/2})$, provided that $\alpha \in [1, (3d-1)/(3d-2))$.
This completes the proof. □
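
The restriction on $\alpha$ in the last statement is exactly the range in which the exponent $\frac{1}{2} - \frac{3d}{2} - \alpha + \frac{3\alpha d}{2}$ is negative. The short sketch below (hypothetical function names, exact rational arithmetic) makes this explicit.

```python
from fractions import Fraction


def rate_exponent(d, alpha):
    """Exponent 1/2 - 3d/2 - alpha + 3*alpha*d/2 of n in the bound for d_K(G_{n,d}, N)."""
    return Fraction(1, 2) - Fraction(3 * d, 2) - alpha + Fraction(3 * d, 2) * alpha


def critical_alpha(d):
    """Upper end (3d - 1)/(3d - 2) of the admissible range of alpha."""
    return Fraction(3 * d - 1, 3 * d - 2)


for d in (1, 2, 3):
    # The exponent vanishes at the critical value and equals -1/2 at alpha = 1.
    print(d, critical_alpha(d), rate_exponent(d, critical_alpha(d)), rate_exponent(d, Fraction(1)))
```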


6. Application to percolation on trees and proof of Theorem 11.41 

Let us recall some notation and embed the objects into the framework of Sections 2 and
4. We denote by $\mathcal{T}$ an infinite rooted tree such that each vertex has degree bounded by
$D + 1$ with $D \in \mathbb{N}$. By $\mathcal{T}_n$, $n \in \mathbb{N}$, we indicate the finite sub-tree of $\mathcal{T}$ consisting of all
vertices with graph-distance at most $n$ from the root. We now embed $\mathcal{T}$ into the Euclidean
plane by the following procedure, which is illustrated in Figure 3. The root is mapped to
the point with coordinates $(1, 1)$ and the vertices adjacent to it are mapped to the points
with coordinates $(1, 2), \ldots, (N(1), 2)$ with $N(1) \le D$ in an arbitrary order. Next, the vertices
adjacent to these are mapped onto $(1, 3), \ldots, (N(2), 3)$, where (from left to right) the first
points in this list are adjacent to $(1, 2)$, the next points to $(2, 2)$, etc. Continuing this way,
the vertices with graph-distance $n$ to the root are mapped onto $(1, n+1), \ldots, (N(n), n+1)$
with $N(n) \le N(n-1) D$ and the infinite tree $\mathcal{T}$ is embedded into the upper right quadrant
of the Euclidean plane. A vertex of the embedded tree with coordinates $(i, k)$ for $k \in \mathbb{N}$
and $i \in \{1, \ldots, N(k)\}$ receives the label $1 + N(1) + \ldots + N(k-1) + i$ and an edge of the
embedded tree whose adjacent vertices have coordinates $(i, k)$ and $(j, k-1)$ for $k \in \{2, 3, \ldots\}$,
$i \in \{1, \ldots, N(k)\}$ and $j \in \{1, \ldots, N(k-1)\}$ finally receives the label of its endpoint minus
one, i.e. $N(1) + \ldots + N(k-1) + i$, see Figure 3. This numbering of vertices also corresponds
to that in Figure 2.
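
The labelling rule can be made concrete with a few lines of Python. This is a minimal sketch and not part of the original argument: the function names `vertex_label` and `edge_label` and the list `level_sizes` are hypothetical, and the sketch assumes that the index in the label formula is read as the graph-distance $m$ to the root, so that `level_sizes[m - 1]` plays the role of $N(m)$ and the labels run consecutively level by level.

```python
def vertex_label(level_sizes, m, i):
    """Label of the i-th vertex (from the left, 1-based) at graph-distance m from the root.

    level_sizes[m - 1] plays the role of N(m); the root (m = 0, i = 1) gets label 1.
    """
    if m == 0:
        if i != 1:
            raise ValueError("the root is the only vertex at distance 0")
        return 1
    if not 1 <= m <= len(level_sizes):
        raise ValueError("distance m outside the stored levels")
    if not 1 <= i <= level_sizes[m - 1]:
        raise ValueError("position i outside {1, ..., N(m)}")
    # 1 for the root, then all vertices on the levels strictly in between, then the position.
    return 1 + sum(level_sizes[: m - 1]) + i


def edge_label(level_sizes, m, i):
    """Label of the edge joining that vertex to its parent: endpoint label minus one."""
    if m == 0:
        raise ValueError("the root has no edge towards a parent")
    return vertex_label(level_sizes, m, i) - 1


# Level sizes N(1), N(2) of a 2-regular tree cut off at distance 2 from the root.
sizes = [2, 4]
labels = [vertex_label(sizes, m, i) for m in (1, 2) for i in range(1, sizes[m - 1] + 1)]
```

With these level sizes the non-root vertices receive the labels 2 to 7 and the edges the labels 1 to 6, so that every edge label identifies exactly one edge.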

This construction puts us in the position to interpret our percolation problem on $\mathcal{T}$ in terms
of the framework of Theorem 4.1. Namely, for fixed $p \in (0, 1)$ let $(X_k)_{k \in \mathbb{N}}$ be a sequence of
independent Rademacher random variables with $P(X_k = +1) = p$ and $P(X_k = -1) = 1 - p$.
For each $k \in \mathbb{N}$, assign the random variable $X_k$ to the uniquely determined edge of the
embedded tree with label $k$. The random graph $\mathcal{T}(p)$ consists of all edges of the embedded
tree whose label $k$ satisfies $X_k = +1$, together with their two adjacent vertices. Thus, $\mathcal{T}(p)$ is described
by the Rademacher sequence $(X_k)_{k \in \mathbb{N}}$ and its restriction $\mathcal{T}_n(p)$ to $\mathcal{T}_n$ is described by a finite
sub-sequence of $(X_k)_{k \in \mathbb{N}}$.

For $n \in \mathbb{N}$, we denote by $C_n(p)$ the number of connected components of the random graph
$\mathcal{T}_n(p)$, where, as already discussed in the introduction, by a connected component we
understand a maximal connected sub-graph with at least one edge. By
$H_n(p) := (C_n(p) - \mathbb{E}[C_n(p)]) / \sqrt{\mathrm{Var}[C_n(p)]}$ we denote the normalized version of $C_n(p)$ and notice that $C_n(p)$ is
a Rademacher functional.
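
To see $C_n(p)$ as a function of the signs attached to the edges, the following Python sketch evaluates it on a $D$-regular tree. It is an illustration under stated assumptions, not the authors' construction: the helper names `regular_tree_parents`, `count_components` and `sample_configuration` are hypothetical, the tree is stored as a map from each non-root vertex label to its parent's label, and the edge with label $k$ is taken to join vertex $k + 1$ to its parent. The count uses the observation that, in a rooted forest, every component with at least one edge has exactly one vertex closest to the root.

```python
import random


def regular_tree_parents(D, n):
    """Parent map {vertex label: parent label} of a D-regular tree up to distance n.

    Vertices are labelled level by level from left to right, the root being 1,
    so the edge with label k joins vertex k + 1 to its parent.
    """
    parents = {}
    level, next_label = [1], 2
    for _ in range(n):
        new_level = []
        for v in level:
            for _ in range(D):
                parents[next_label] = v
                new_level.append(next_label)
                next_label += 1
        level = new_level
    return parents


def count_components(parents, x):
    """Number of connected components with at least one edge.

    x maps an edge label k to +1 (edge kept) or -1 (edge removed).
    A vertex is the top of such a component if it has a kept edge towards a
    child and no kept edge towards its own parent (or if it is the root).
    """
    with_kept_child = {u for v, u in parents.items() if x[v - 1] == 1}
    return sum(1 for u in with_kept_child if u == 1 or x[u - 1] == -1)


def sample_configuration(num_edges, p, rng):
    """Independent signs with P(+1) = p and P(-1) = 1 - p, indexed by edge label."""
    return {k: 1 if rng.random() < p else -1 for k in range(1, num_edges + 1)}


parents = regular_tree_parents(D=2, n=3)
rng = random.Random(0)
x = sample_configuration(len(parents), p=0.5, rng=rng)
c = count_components(parents, x)
```

Flipping a single sign `x[k]` and recomputing the count changes the value by at most one, which is the locality property used in the proof.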


Proof of Theorem 1.4. We start by investigating the first- and second-order discrete gradient
applied to $H_n(p)$. By definition, we have that

$$D_k H_n(p) = \frac{1}{\sqrt{\mathrm{Var}[C_n(p)]}} \, D_k C_n(p) = \frac{\sqrt{pq}}{\sqrt{\mathrm{Var}[C_n(p)]}} \big( (C_n(p))_k^+ - (C_n(p))_k^- \big),$$

where $k \in \{1, \ldots, 1 + N(1) + \ldots + N(n)\}$. Note that $D_k C_n(p)$ is a local quantity since it
depends only on the edges adjacent to the edge with label $k$. Adding or removing the edge with label $k$ can
change the number of connected components by at most 1. Therefore, we have that


$$|D_k H_n(p)| \le \frac{\sqrt{pq}}{\sqrt{\mathrm{Var}[C_n(p)]}} \tag{6.1}$$

for all $k \in \{1, \ldots, 1 + N(1) + \ldots + N(n)\}$. Next, we consider for $k, j \in \{1, \ldots, 1 + N(1) +
\ldots + N(n)\}$ the second-order discrete gradient


$$D_k D_j H_n(p) = \frac{pq}{\sqrt{\mathrm{Var}[C_n(p)]}} \Big( ((C_n(p))_j^+)_k^+ - ((C_n(p))_j^+)_k^- - ((C_n(p))_j^-)_k^+ + ((C_n(p))_j^-)_k^- \Big).$$


For most choices of $j$ and $k$, $D_k D_j H_n(p)$ is zero. A non-zero contribution only arises if the
edges $e_j$ and $e_k$ with labels $j$ and $k$, respectively, share precisely one common vertex. We
indicate this situation by $|e_j \cap e_k| = 1$ and write $|e_j \cap e_k| \in \{0, 2\}$ otherwise. Thus, we can
use the triangle inequality and the estimate (6.1) to conclude that


$$|D_j D_k H_n(p)| \; \begin{cases} = 0 & \text{if } |e_j \cap e_k| \in \{0, 2\}, \\[4pt] \le \dfrac{2pq}{\sqrt{\mathrm{Var}[C_n(p)]}} & \text{if } |e_j \cap e_k| = 1. \end{cases} \tag{6.2}$$
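
The factor $2pq$ in (6.2) can be traced step by step. The following is a sketch that assumes the discrete gradient has the form $D_j F = \sqrt{pq}\,(F_j^+ - F_j^-)$, as the displayed formulas for $D_k H_n(p)$ suggest. It also uses that the bound (6.1) holds for every configuration of the edges, hence also after the $j$-th coordinate has been fixed to $+1$ or $-1$.

```latex
\begin{align*}
|D_j D_k H_n(p)|
&= \sqrt{pq}\,\big| (D_k H_n(p))_j^+ - (D_k H_n(p))_j^- \big| \\
&\le \sqrt{pq}\,\Big( \big|(D_k H_n(p))_j^+\big| + \big|(D_k H_n(p))_j^-\big| \Big) \\
&\le \sqrt{pq}\cdot \frac{2\sqrt{pq}}{\sqrt{\mathrm{Var}[C_n(p)]}}
 = \frac{2pq}{\sqrt{\mathrm{Var}[C_n(p)]}} .
\end{align*}
```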


We use a lower bound for the variance of $C_n(p)$, which can be found in [37, Identity (2.3)]
in case of a $D$-regular tree, but the proof is easily seen to carry over to our situation. More
precisely, there exists a constant $c(p) > 0$ only depending on $p$ such that

$$\mathrm{Var}[C_n(p)] \ge c(p) \, |\mathcal{T}_n|. \tag{6.3}$$


Estimating the terms in Theorem 4.1 with $r = 2$ and $s = t = 4$ there by means of (6.1)-(6.3)
yields (after a straightforward computation similar to the one in the proof of Theorem 1.3)
that

$$d_K(H_n(p), N) = O(|\mathcal{T}_n|^{-1/2}).$$


In case of a $D$-regular tree, we have that $|\mathcal{T}_n| = D + \ldots + D^n = (D^{n+1} - 1)/(D - 1) - 1$, if
$D \ge 2$, and $|\mathcal{T}_n| = n$, if $D = 1$. Thus, $|\mathcal{T}_n|$ behaves like $D^n$, if $D \ge 2$, and like $n$, if $D = 1$,
as $n \to \infty$. This completes the proof. □
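
For $D \ge 2$ the statement that the size of the truncated tree behaves like $D^n$ can be made explicit by a two-sided bound. The following is a short sketch based only on the geometric sum above, valid for $n \ge 1$: the lower bound is equivalent to $D^n \ge D$, and the upper bound uses $D/(D-1) \le 2$.

```latex
\begin{align*}
|\mathcal{T}_n| = \frac{D^{n+1}-1}{D-1} - 1 = \frac{D\,(D^n - 1)}{D-1},
\qquad
D^n \;\le\; \frac{D\,(D^n-1)}{D-1} \;\le\; 2\,D^n .
\end{align*}
```

Inserted into a bound of order $|\mathcal{T}_n|^{-1/2}$, this gives the rate $O(D^{-n/2})$ for $D \ge 2$, while for $D = 1$ the rate is $O(n^{-1/2})$.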